Berk Balci (CERN) | 25/05/2026, 13:45 | Track 4 - Distributed computing | Oral Presentation
Identity and Access Management (IAM) in a large-scale research collaboration typically serves both organisational and distributed community needs. CERN operates at this intersection, balancing local institutional requirements with those of a worldwide ecosystem of scientific partners.
This presentation will outline the evolution of CERN's Single Sign-On platform (based on Keycloak) and the...
Dr David Crooks (UKRI STFC) | 25/05/2026, 14:03 | Track 4 - Distributed computing | Oral Presentation
The risk of cyber attack against members of the research and education sector remains persistently high, with several recent high-visibility incidents including a well-reported ransomware attack against the British Library. As reported previously, we must work collaboratively to defend our community against such attacks, notably through the active use of threat intelligence shared with trusted...
Francesco Giacomini (INFN CNAF) | 25/05/2026, 14:21 | Track 4 - Distributed computing | Oral Presentation
INDIGO IAM is an Identity and Access Management service providing authentication and authorization across distributed research infrastructures. It is a Spring Boot application relying on OAuth/OpenID Connect (OIDC) technologies and is currently evolving to meet increasingly stringent requirements in terms of security, interoperability and observability.
A key aspect is the progressive...
Dr Marcus Hardt (KIT) | 25/05/2026, 14:39 | Track 4 - Distributed computing | Oral Presentation
Traditional SSH key-based authentication presents significant scalability and security challenges in modern federated research environments, particularly regarding key distribution, lifecycle management, and access revocation. This paper presents ssh-oidc, a novel approach that integrates OpenID Connect (OIDC) authentication with SSH certificate-based access control for scientific...
Mr Tom Dack (STFC UKRI) | 25/05/2026, 14:57 | Track 4 - Distributed computing | Oral Presentation
The migration away from using X.509 towards token-based authentication within the Worldwide LHC Computing Grid (WLCG) infrastructure has required many redesigns of the various workflows, ranging from data management through to job submission, and various activities in between. To compound the complexity of this transition, different user groups within WLCG have adopted different token use...
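The tokens at the centre of this migration are JSON Web Tokens whose claims (subject, scopes, expiry) drive authorization decisions. As a minimal sketch, not taken from any WLCG tool and using a locally fabricated token with WLCG-profile-style claim names, the payload can be inspected by base64url-decoding the middle segment:

```python
import base64
import json

def b64url_decode(part: str) -> bytes:
    # JWT segments use unpadded base64url; restore padding before decoding
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))

def decode_jwt_claims(token: str) -> dict:
    # Split header.payload.signature and decode the payload only
    # (no signature verification: this is inspection, not validation)
    _header, payload, _signature = token.split(".")
    return json.loads(b64url_decode(payload))

# Fabricated example payload with WLCG-profile-style claims
claims = {
    "wlcg.ver": "1.0",
    "sub": "example-user",
    "scope": "storage.read:/ compute.create",
    "exp": 1750000000,
}
fake_token = ".".join([
    base64.urlsafe_b64encode(json.dumps({"alg": "none"}).encode()).decode().rstrip("="),
    base64.urlsafe_b64encode(json.dumps(claims).encode()).decode().rstrip("="),
    "",  # empty signature segment for this local example
])
print(decode_jwt_claims(fake_token)["scope"])  # storage.read:/ compute.create
```

In production the signature must of course be verified against the issuer's keys; the point here is only that capabilities travel inside the token rather than in an external grid-mapfile, which is what forces the workflow redesigns the abstract describes.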
Jeff Templon | 25/05/2026, 16:15 | Track 4 - Distributed computing | Oral Presentation
This effort revisits the issue of scheduling multicore workloads on shared multipurpose, multi-user clusters. This issue was extensively studied and reported on for CHEP 2015. Since then, both the cluster-management technology and the typical grid-cluster workloads have evolved, with consequences for scheduling approaches.
The relevant developments will be discussed, and arguments made that...
Maria-Elena Mihailescu (National University of Science and Technology POLITEHNICA Bucharest (RO)) | 25/05/2026, 16:33 | Track 4 - Distributed computing | Oral Presentation
Authors: Maria-Elena Mihăilescu (National University of Science and Technology Politehnica Bucharest, maria.mihailescu@upb.ro), Costin Grigoraș (CERN, costin.grigoras@cern.ch), Latchezar Betev (CERN, latchezar.betev@cern.ch), Mihai Carabaș (National University of Science and Technology Politehnica Bucharest, mihai.carabas@upb.ro), on behalf of the ALICE Collaboration.
JAliEn functions as...
CMS Collaboration | 25/05/2026, 16:51 | Track 4 - Distributed computing | Oral Presentation
The resource landscape available to LHC experiments is evolving, driven by industry trends and funding-agency policies, from traditional WLCG sites dominated by x86 CPU resources towards larger consolidated facilities, with a growing fraction of supercomputing centers and a rising degree of hardware heterogeneity. The CMS experiment, which has already demonstrated substantial throughput...
Dr Brij Kishor Jashal (Rutherford Appleton Laboratory) | 25/05/2026, 17:09 | Track 4 - Distributed computing | Oral Presentation
Managing job-slot allocation in a multi-VO environment remains a persistent operational challenge for WLCG sites, particularly when each Virtual Organization (VO) employs distinct workload-management and scheduling behaviors. At the RAL Tier-1 (RAL-LCG2), more than a dozen VOs, including CMS, ATLAS, LHCb, and several smaller communities, compete for heterogeneous resources while relying on...
Marta Bertran Ferrer (CERN) | 25/05/2026, 17:27 | Track 4 - Distributed computing | Oral Presentation
ALICE Grid sites employ heterogeneous resource allocation policies, where each configuration is tailored to the specific conditions of the sites, their user communities, and local scheduling preferences. The design and implementation of JAliEn have been specifically developed to be flexible and adaptable to these varied configurations and execution systems, allowing it to utilize the allocated...
CMS Collaboration | 25/05/2026, 17:45 | Track 4 - Distributed computing | Oral Presentation
Efficient use of distributed computing resources is essential for sustaining the growing processing demands of the CMS experiment. Building on our previous work to assess and minimize unused CPU cycles, new advances in scheduling strategies that further improve resource utilization are being developed for the CMS Global Pool.
The CMS Submission Infrastructure team is deploying enhanced...
Lorenzo Valentini (CERN) | 26/05/2026, 13:45 | Track 4 - Distributed computing | Oral Presentation
Distributed computing infrastructures that support modern large-scale scientific experiments must remain reliable, scalable, and flexible. HammerCloud (HC) provides an automated framework for continuous testing, benchmarking, and commissioning of services within the Worldwide LHC Computing Grid (WLCG), using realistic full-chain experiment workflows.
As the technical computing environment...
Fernando Harald Barreiro Megino (University of Texas at Arlington) | 26/05/2026, 14:03 | Track 4 - Distributed computing | Oral Presentation
The ATLAS experiment at the CERN Large Hadron Collider relies on a worldwide distributed computing infrastructure to process millions of production and analysis jobs daily across grid, cloud, and HPC resources. The ATLAS Distributed Computing (ADC) system integrates workload, data, and resource management services to ensure efficient use of heterogeneous environments. Within ADC, the PanDA...
Sakib Rahman | 26/05/2026, 14:21 | Track 4 - Distributed computing | Oral Presentation
The ePIC experiment at the upcoming Electron-Ion Collider (EIC) continues to expand its simulation production capabilities on the Open Science Grid (OSG) infrastructure. We report on three significant developments since our previous work: the integration of background processes into simulation production, comprehensive testing of the PanDA workload management system, and progress in Rucio...
Alexandre Franck Boyer (CERN) | 26/05/2026, 14:39 | Track 4 - Distributed computing | Oral Presentation
DiracX is the next incarnation of DIRAC: a modern, cloud-native platform for managing distributed computing across multiple research infrastructures for one or more virtual organizations. Leveraging two decades of DIRAC experience, DiracX delivers a faster, more capable, and user-friendly environment for scientists, administrators, and developers alike.
In this contribution we build...
Andrea Piccinelli (University of Notre Dame (US)) | 26/05/2026, 14:57 | Track 4 - Distributed computing | Oral Presentation
The Compact Muon Solenoid (CMS) experiment is reassessing its Workload Management (WM) stack to meet HL-LHC scale, heterogeneity, and a 20-25-year sustainability horizon. Over the past year, we surveyed multiple pathways (including reuse of external WM systems, hybrid approaches, and a ground-up redesign) and developed a blueprint that emphasizes architectural principles of the HL-LHC WM...
Zhengde Zhang (Institute of High Energy Physics, Chinese Academy of Sciences) | 26/05/2026, 16:15 | Track 4 - Distributed computing | Oral Presentation
We present Dr.Sai, a large language model (LLM)-powered multi-agent system designed to autonomously execute physics analysis at the BESIII experiment. It interprets a physicist's natural language request, decomposes it into tasks (e.g., data skimming, fitting), calls the appropriate scientific tools, and executes the workflow end-to-end. A demonstration will show Dr.Sai completing multiple simple...
Pietro Lugato (Massachusetts Inst. of Technology (US)) | 26/05/2026, 16:33 | Track 4 - Distributed computing | Oral Presentation
A2rchi (AI Augmented Research Chat Intelligence) is an open-source, end-to-end framework for building AI agents to automate research and operational workflows. Various groups have already applied the system to their use cases; the most advanced is the Computing Operations (CompOps) team at the Compact Muon Solenoid (CMS) experiment at CERN. CompOps has a private, constantly evolving, and...
Albert Gyorgy Borbely (University of Glasgow (GB)) | 26/05/2026, 16:51 | Track 4 - Distributed computing | Oral Presentation
Recent developments demonstrate that HEP software can run effectively on GPUs, while advances in ML models have shown predictable scaling laws for compute, data, and model size, consistent with trends across the wider AI community. As a result, there is growing demand within HEP for inference using larger models that have already delivered significant physics gains, such as b-tagging...
Brian Paul Bockelman (University of Wisconsin Madison (US)) | 26/05/2026, 17:09 | Track 4 - Distributed computing | Oral Presentation
For distributed High Throughput Computing (dHTC), the original, and potentially still most popular, interface for workflow management is the command line interface (CLI). Decades of researchers have been trained on the CLI, and knowledgeable users can effectively integrate it into larger scripts with little friction. As the ecosystem has grown and matured, new interfaces have appeared...
Sergio Andreozzi | 26/05/2026, 17:27 | Track 4 - Distributed computing | Oral Presentation
The EGI Federation, which emerged from WLCG in 2010, has been a cornerstone of European and global digital science for over 15 years, providing a federated e-infrastructure for 150,000+ researchers across all scientific disciplines. The recently approved "EGI Federation Strategy 2026-2030" sets out an ambitious plan for the next 5 years to ensure that EGI remains an accelerator for science...
Mr Dhiraj Kalita (KEK (High Energy Accelerator Research Organization)) | 26/05/2026, 17:45 | Track 4 - Distributed computing | Oral Presentation
The Belle II experiment at KEK, Japan, operates with a data volume exceeding 30 petabytes, with datasets distributed and processed worldwide using DIRAC and Rucio. With this globally distributed computing infrastructure, and expecting an order of magnitude larger data volume, we face operational challenges for both computing experts and end-users. The end-users frequently struggle with...
Stefano Dal Pra (INFN) | 27/05/2026, 13:45 | Track 4 - Distributed computing | Oral Presentation
We describe a set of tools developed to ease the execution of large computing campaigns over multiple and different computing resource providers. The tool suite has been adopted to perform the All-sky Continuous GW search on the data of the fourth LIGO-Virgo-KAGRA Observation cycle (O4), running CPU payloads on the IGWN Grid, INFN-CNAF, ICSC Grid (based on HTCondor, with different...
Ryunosuke O'Neil (CERN) | 27/05/2026, 14:03 | Track 4 - Distributed computing | Oral Presentation
Delivering reproducible computational workflows across heterogeneous and distributed computing infrastructures remains a significant challenge for many scientific communities. Workflow standards such as the Common Workflow Language (CWL) offer a portable and declarative means to describe complex pipelines, but their integration into large-scale, data-driven workload management systems remains...
David Schultz (University of Wisconsin-Madison) | 27/05/2026, 14:21 | Track 4 - Distributed computing | Oral Presentation
After a long delay and several false starts, the IceCube Neutrino Observatory has removed GridFTP and X.509 certificate authentication. We have migrated to using the Pelican Platform, the Open Science Data Federation, and WLCG tokens. While this is a common solution, we required several customizations to work with our existing data warehouse structure and to make it easier for scientists to use. We...
Panos Paparrigopoulos (CERN) | 27/05/2026, 14:39 | Track 4 - Distributed computing | Oral Presentation
The Computing Resource Information Catalogue (CRIC) is a central element of the WLCG information ecosystem and a key operational tool for ATLAS Distributed Computing, providing authoritative, experiment-oriented views of sites, services, data-management endpoints and configuration parameters across distributed infrastructures. In preparation for HL-LHC, CRIC has undergone a major evolution: a...
Jingyan Shi (IHEP) | 27/05/2026, 14:57 | Track 4 - Distributed computing | Oral Presentation
The High Energy cosmic Radiation Detection facility (HERD) is a long-term space-based high-energy physics experiment onboard the China Space Station, expected to produce large and heterogeneous datasets, including flight data, simulation data, and multi-version reconstructed data. To efficiently support large-scale computing and long-term physics analysis, a unified data management and...
Dr Stefano Bagnasco (Istituto Nazionale di Fisica Nucleare, Torino) | 27/05/2026, 16:15 | Track 4 - Distributed computing | Oral Presentation
The LIGO-Virgo-KAGRA (LVK) Collaboration closed its fourth observation period (O4) in November 2025, its longest and richest to date. During O4, the detectors observed roughly 250 gravitational-wave candidate signals in real time, and more are extracted from the data by offline analysis. Outstanding results were, for example, the first detection of "second generation" black holes, in which the...
Paul James Laycock (Universite de Geneve (CH)) | 27/05/2026, 16:33 | Track 4 - Distributed computing | Oral Presentation
The Einstein Telescope (ET) will be the next-generation European underground Gravitational Wave (GW) observatory, designed to open a new observational window on the Universe starting in the mid to late 2030s. Building upon the experience of current GW detectors such as LIGO and Virgo, ET will achieve a significant increase in sensitivity, enabling the detection of a much larger number of GW...
Ivan Glushkov (Brookhaven National Laboratory (US)) | 27/05/2026, 16:51 | Track 4 - Distributed computing | Oral Presentation
ATLAS Distributed Computing (ADC) is the set of infrastructure, software stack and experts that handle up to 1 million computing slots and over 1 EB of stored data in order to facilitate the computing needs of the ATLAS experiment at the LHC accelerator. After a short description of the ADC structure and operational performance, this contribution focuses on the latest ADC innovations as well as...
Holly Szumila-Vance (Florida International University) | 27/05/2026, 17:09 | Track 4 - Distributed computing | Oral Presentation
The ePIC collaboration is developing a highly integrated, multi-purpose detector for the upcoming Electron-Ion Collider (EIC). A co-design approach between the detector and the computing enables a seamless data flow from detector readout to physics analysis, using streaming readout and AI. This system is aimed at accelerating scientific discovery and improving measurement precision through...
Xiaomei Zhang (Chinese Academy of Sciences (CN)) | 27/05/2026, 17:27 | Track 4 - Distributed computing | Oral Presentation
The Jiangmen Underground Neutrino Observatory (JUNO) commenced physics data taking in August 2025, marking the transition from commissioning to full-scale operation of its Distributed Computing Infrastructure (DCI) system for real physics data. This contribution presents the Monte Carlo production and physics production experience accumulated during the first year of data taking.
We provide...
Artem Petrosyan (Joint Institute for Nuclear Research (RU)) | 27/05/2026, 17:45 | Track 4 - Distributed computing | Oral Presentation
The SPD (Spin Physics Detector) facility is currently under construction as part of the NICA complex at JINR. In parallel with the physical infrastructure, the experiment's software ecosystem is being developed to meet the growing need for large-scale simulation of physical processes.
As an international collaboration, SPD leverages the distributed computing resources contributed by its...
CMS Collaboration | 28/05/2026, 13:45 | Track 4 - Distributed computing | Oral Presentation
For a few years INFN has been investing effort in exploring technologies to seamlessly integrate distributed resources to effectively enable high-rate data analysis patterns supporting interactive and/or quasi-interactive analysis of sizable amounts of data. One of the main drivers for this initiative is to contribute to the R&D activities for the evolution of the analysis computing model for...
Doug Benjamin (Brookhaven National Laboratory (US)) | 28/05/2026, 14:03 | Track 4 - Distributed computing | Oral Presentation
High energy physics (HEP) workflows are approaching the throughput limits of traditional grid/HTC computing, as LHC and DUNE are driving O(10-100)× data growth and increased GPU demand. This motivates a practical path to routinely use leadership-class HPC resources remotely. One of the challenges is the varied authentication, authorization and job submission mechanisms at different HPC...
Ozgur Ozan Kilic (Brookhaven National Laboratory) | 28/05/2026, 14:21 | Track 4 - Distributed computing | Oral Presentation
As HEP experiments increasingly rely on diverse computing resources across multiple facilities, sustainable workflow orchestration that bridges experiment-native tools with facility-native interfaces becomes critical. This work develops and evaluates a generalizable approach to cross-facility workflow integration, using the DUNE 2×2 Near Detector simulation as a challenging demonstrator case....
Valentin Volkl (CERN) | 28/05/2026, 14:39 | Track 4 - Distributed computing | Oral Presentation
The CernVM-Filesystem (CVMFS) is a global, read-only, on-demand filesystem optimized for software distribution. Its on-demand nature is well adapted and extremely efficient for distributed batch computing, but can mean noticeable latency in interactive use, especially when working with applications such as Python that load a large number of small files on startup.
In this contribution we...
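The interactive-latency problem described in this abstract is essentially a per-file round-trip cost on cache misses. A toy back-of-the-envelope model (all numbers are illustrative assumptions, not CVMFS measurements) shows why startup over a cold cache hurts and why local caching recovers almost all of the time:

```python
# Toy model: startup delay = cache misses * network round-trip time.
# On a cold cache every small file fetch pays a round trip; once files
# are cached locally, only a residual miss fraction does.
N_FILES = 4000          # assumed small files a large Python app touches at startup
RTT_MS = 20.0           # assumed wide-area round-trip per cache miss

def startup_delay_ms(files: int, miss_fraction: float, rtt_ms: float) -> float:
    # Only cache misses pay the network round trip; hits are ~free here
    return files * miss_fraction * rtt_ms

cold = startup_delay_ms(N_FILES, 1.0, RTT_MS)   # nothing cached yet
warm = startup_delay_ms(N_FILES, 0.02, RTT_MS)  # assumed 2% residual misses
print(f"cold start: {cold / 1000:.0f} s, warm start: {warm / 1000:.1f} s")
```

Under these assumed numbers a cold start costs tens of seconds while a warm one is near-interactive, which is why prefetching and cache-warming strategies matter for this use case.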
Marta Bertran Ferrer (CERN) | 28/05/2026, 14:57 | Track 4 - Distributed computing | Oral Presentation
Effective tools for monitoring Grid workflow executions are crucial for the prompt identification of issues, which in turn facilitates the design and deployment of appropriate solutions. The ALICE Grid middleware JAliEn utilizes the MonALISA framework to monitor all its Grid components, which collectively generate an enormous amount of data - about 200,000 monitored parameters per second...
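At rates like the 200,000 monitored parameters per second quoted above, raw samples are typically downsampled into time windows before storage. A generic sketch of that idea (this is not the actual MonALISA aggregation code; the tuple layout and window size are assumptions for illustration) averages each parameter per fixed window:

```python
from collections import defaultdict

def aggregate(samples, window_s=60):
    # Downsample (timestamp, parameter, value) tuples into per-window
    # averages, the standard way a high-rate metric stream is made storable
    acc = defaultdict(lambda: [0.0, 0])  # (window_start, parameter) -> [sum, count]
    for ts, param, value in samples:
        key = (int(ts) // window_s * window_s, param)
        acc[key][0] += value
        acc[key][1] += 1
    return {k: s / n for k, (s, n) in acc.items()}

# Fabricated samples: one parameter sampled four times across two minutes
stream = [(0, "cpu", 1.0), (10, "cpu", 3.0), (70, "cpu", 5.0), (80, "cpu", 7.0)]
print(aggregate(stream))  # {(0, 'cpu'): 2.0, (60, 'cpu'): 6.0}
```

Four samples collapse to two stored points; at production rates the same reduction is what turns an unmanageable stream into a queryable time series.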
Carlos Borrajo Gomez (CERN) | 28/05/2026, 16:15 | Track 4 - Distributed computing | Oral Presentation
As part of Run 3 of the Large Hadron Collider (LHC), the CMS experiment generates large amounts of data that have to be processed and stored efficiently. The complex distributed computing infrastructure used for these purposes has to be highly available, and a reliable and comprehensive monitoring setup is essential for it. The CMS monitoring team is responsible for providing the...
Panos Paparrigopoulos (CERN) | 28/05/2026, 16:33 | Track 4 - Distributed computing | Oral Presentation
The WLCG infrastructure is evolving to support the HL-LHC, requiring greater capacity and increasingly diverse resource types, which challenges the existing accounting system to become more flexible in handling heterogeneous resources such as GPUs and in incorporating new metrics, including environmental and sustainability indicators. The current system relies on outdated and overly complex...
Marta Bertran Ferrer (CERN) | 28/05/2026, 16:51 | Track 4 - Distributed computing | Oral Presentation
The ALICE Grid incorporates a large volume of heterogeneous resources, including systems with a diverse range of CPU and GPU resources, various operating system versions, and differing hardware architectures. The Central Grid Operation team lacks direct access to the individual clusters and nodes that compose the Grid, which presents numerous challenges to fully understanding and optimizing...
Maksim Melnik Storetvedt (Western Norway University of Applied Sciences (NO)) | 28/05/2026, 17:09 | Track 4 - Distributed computing | Oral Presentation
The ALICE Collaboration actively relies on accelerators, such as GPUs, to handle increasingly complex workflows and data rates. Such resources have rapidly risen in importance across a number of use cases, and their emergence is reflected in their availability in the WLCG. Through broader vendor support, as well as improved matching techniques, the ALICE Grid middleware may allocate and use...
Borja Garrido Bear (CERN) | 28/05/2026, 17:27 | Track 4 - Distributed computing | Oral Presentation
We present the evolution of the CERN IT Monitoring (MONIT) architecture for the CERN Data Centres and WLCG Infrastructure monitoring use cases, and how it has been updated to improve scalability, interoperability, and observability. Prometheus has been introduced as the core metrics collection and aggregation system and the previous Collectd-based framework is being replaced by Prometheus...
Raghuvar Vijayakumar (University of Freiburg (DE)) | 28/05/2026, 17:45 | Track 4 - Distributed computing | Oral Presentation
Distributed computing infrastructures are shared by multiple research communities, particularly within High Energy Physics (HEP), where precise and transparent resource accounting is critical. To meet these demands, we developed AUDITOR (AccoUnting DatahandlIng Toolbox for Opportunistic Resources), a flexible, modular, and extensible accounting ecosystem designed for heterogeneous computing...