WLCG Management Board #340

Europe/Zurich
513/R-068 (CERN)

Description

16:00 CERN/10:00 EDT/09:00 CDT

URL for the ZOOM call: https://cern.zoom.us/j/66011416092  (the passcode has been distributed by email)

To join by phone (see the dedicated support page):
1) dial your local ZOOM number (e.g. +41432107108), then enter the MB meeting ID 660 1141 6092 followed by #
2) mute your phone (there is no auto-mute facility)
Note: calls are charged to the caller (there is no callback facility)

Email Distribution List: worldwide-lcg-management-board@cern.ch
Email List Archive: worldwide-lcg-management-board (requires CERN authentication)
Minutes: Management Board Meeting Minutes
Action List: MbActionList
See also: WLCG Document Repository, WLCG Web Site

Zoom Meeting ID
66011416092
Host
Andrea Sciabà
Alternative hosts
Simone Campana, Maarten Litmaath
    • 4:00 PM 4:25 PM
      Minutes and Matters Arising 25m
      • WLCG Matters Arising 10m
        Speaker: Dr Tommaso Boccali (INFN Sezione di Pisa)
        • T1s/FAs: please remember to send your view on the strategy for the increased IT costs. We need it for the next LHCC!

        • We met with the WLCG CB Chairperson to schedule our next actions. For the revision of Annex 5, which has been in the pipeline for some months, we propose:

          • We (WLCG management) go back and look at the comments left last year and, since they came mostly from two or three people, we talk with them directly

          • Then, we propose (possibly by the next MB, but these are busy times) a general light review and an approval

          • Main open points: the LCG (CERN project)/WLCG (Collaboration) status and a clear procedure for the onboarding of Associated Partners

          • Indeed, we were contacted by EICO (EIC International Computing Organisation) about a potential interest in joining

         

        GPU/Heterogeneous computing: Following the discussions at the last LHCC and RRB, and the presentation by Antonio Perez at the March MB on the Job Allocation Task Force, we propose a plan towards a holistic approach to heterogeneous computing, with a clear focus on GPUs for now.

        The timing is favourable, since the WLCG Technical Roadmap is in the pipeline, with a chapter dedicated to the subject.
        It needs to be fully clear upfront that the effort is a technical one, to make sure WLCG systems and procedures are in place to integrate heterogeneous resources. It is by no means a push on the experiments to adopt them, nor on the sites to provision GPUs. These decisions pertain to the experiments and sites, also through the C-RSG process. WLCG needs to be ready in case these interests converge, so that it can effectively help with the deployment (also at the level of pledges).

        A holistic approach is needed for this multifaceted problem. The main areas are at least:

          • Benchmarking: What is the HS23 equivalent of CPU-GPU systems? This is more complicated than benchmarking CPU because it will evolve over time as workflows make better use of the GPU.

          • Information System/Accounting: How is the information captured and reported? How dynamic is it? How detailed is it? This is more complex than with CPU.

          • Provisioning (WLCG facilities, Cloud, HPC, etc.): How do the resources need to be presented in order for WLCG workloads to exploit them?

          • Online and Offline Experiments’ Software and Workflows: How does software need to evolve, and how can it? How can offline acquire information from the more advanced online world?

         

        The above areas would technically allow us to discuss the possible pledging of this type of resource (WLCG Resource Provisioning and Pledging Task Force). TCO and agreement on accepting these pledges from the experiments are a different dimension. ⇒ Our role now is to be technically ready for the opportunities that might arise and for the FA decisions we might be forced to follow.
        Therefore, we propose:

        • To set up a light structure overseeing all of these aspects. We envisioned a senior member, possibly a former computing coordinator, to serve as facilitator, especially in the interactions with the experiments (including the physics and online worlds). We selected Stefano Piano, former ALICE Computing Coordinator and currently involved in GPU-oriented development, as the ideal candidate.

        • Stefano would now join as co-editor of the heterogeneous chapter of the roadmap, joining Oxana, Ianna and Doug

        • After the WLCG roadmap has been reviewed and eventually (hopefully!) approved by the LHCC, the activity will become a permanent one throughout LS3.

      • Recent increase of (critical) vulnerabilities 10m
        Speakers: Jose Carlos Luna Duran (CERN), Maarten Litmaath (CERN)
    • 4:25 PM 4:30 PM
      Action List Review 5m
    • 4:30 PM 4:40 PM
      New LHCONE AUP 10m
      Speaker: Edoardo Martelli (CERN)

      New and possibly final version after MB suggestions and LHCONE discussion: https://twiki.cern.ch/twiki/bin/view/LHCONE/LhcOneAup-revised

      Highlights:

      - RC Site: a Resource Contributing site participating in and formally tied to one or more of the participating Collaborations;
      - LHCONE Site: an RC site connected to the LHCONE L3VPN service;
      (replaced "HEP site" with the more generic "RC site")

      The LHCONE community accepts or rejects a request based on its impact on LHCONE. Among the criteria to be used in the evaluation:
          the collaboration MUST have an established relationship with WLCG
      (replaced "must be related to Particle Physics" with "must have an established relationship with WLCG")

      Other computing resources not directly owned by an RC site can be connected to LHCONE, such as:
          - High Performance Computing (HPC):
              - HPC systems are eligible to become part of LHCONE as long as the system belongs to an LHCONE site or has an agreement with an RC site of one of the LHCONE collaborations
              - resources not yet connected to LHCONE have to follow the connection request procedure
          - Cloud Resources:
              - cloud resources are only used for WLCG computing
              - their IP prefixes are dedicated to LHCONE purposes
              - their use and their security provisions are the responsibility of the LHCONE site admin and MUST be consistent with the present AUP and the WLCG standard
              - any other kind of cloud resource that does not fulfil these requirements has to be routed in the standard way (either general R&E IP or commodity internet)
      (addition to "Eligibility for becoming a LHCONE Site")
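
      As an illustration only, the cloud-resource conditions above can be read as a simple routing decision. The sketch below is not part of the AUP: the class, field names and return labels are hypothetical, and it assumes the three cloud conditions must all hold for a resource to be routed over LHCONE.

```python
from dataclasses import dataclass


@dataclass
class CloudResource:
    """Hypothetical description of a cloud resource (field names are illustrative)."""
    wlcg_only: bool           # resource is used only for WLCG computing
    dedicated_prefixes: bool  # its IP prefixes are dedicated to LHCONE purposes
    admin_responsible: bool   # LHCONE site admin is responsible for its use and security,
                              # consistent with the AUP and the WLCG standard


def cloud_routing(resource: CloudResource) -> str:
    """Return 'lhcone' when all three conditions hold, otherwise 'standard'.

    'standard' stands for general R&E IP or commodity internet routing.
    """
    if resource.wlcg_only and resource.dedicated_prefixes and resource.admin_responsible:
        return "lhcone"
    return "standard"
```

      For example, a resource that shares its IP prefixes with non-LHCONE traffic would fall back to standard routing under this reading.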

       

       

    • 4:40 PM 4:50 PM
      WLCG Service Report 10m
      Speakers: Panos Paparrigopoulos (CERN), Maarten Litmaath (CERN)
    • 4:50 PM 5:00 PM
      TCB Report 10m
      Speakers: Alessandro Di Girolamo (CERN), James Letts (Univ. of California San Diego (US))

      Report to WLCG MB #340 on Tuesday, May 19, 2026 at 16:00 (CERN)

      Indico: https://indico.cern.ch/event/1595861/#2-tcb-report

      WLCG Open Technical Forum #11 (May 19th)

      The TCB is working on the presentation for CHEP26. There was a talk rehearsal for comment at OTF#11 today (Zoom only), just before this meeting. The presentation is a week from today so we can still collect comments.

      WLCG DOMA General Meeting (Apr 29th)

      Updated FTS4 timelines were presented in the CMS Offline and Computing Week Open Session [Indico] on March 19th and the April 29th DOMA General Meeting [Indico]. The user input freeze will happen in June and the first official release is planned in Q4 of 2026, in time for DC27 in February 2027. Deployment in production will take place first at CERN before DC27 and coordinated with other sites afterwards in tandem with the Alma10 rollout.

      WLCG Open Technical Forum #10 (Apr. 21st)

      OTF#10, with the theme of medium-to-long-term evolution of sites, was co-located with the HEPiX Spring Workshop in Lisbon, Portugal on Tuesday, April 21, 2026 [Indico]. Live notes are attached to the Indico agenda page.

      There were presentations and discussions on operational effort at sites, tape systems and performance evolution, storage capacity and throughput evolution, and a panel discussion on deploying services with k8s. 

      • A survey on operational effort is planned to be completed before the HEPiX Fall Workshop in October in Nebraska, since the last one was done in 2014. This survey will be coordinated by WLCG (Ops and TechCoord) and Pepe.

      • Understanding the storage I/O requirements of the experiments is important (e.g., for the WLCG Technical Roadmap), especially since capacity is increasing significantly faster than throughput for both tape and disk. Flash storage may help performance in this respect, but there is no unified approach (caching, tiered storage, or standalone).

      • K8s was seen as a useful way to orchestrate services, but by no means the only one; it is important to choose the “right tool for the job.” Advantages include the potential for sharing Helm charts and configurations, a broader knowledge base for operations staff, and possible room for distributed operations. Drawbacks are the steep learning curve, which can challenge small site teams, and a multilayered infrastructure that complicates debugging when issues arise.

       

      Two WLCG federations, U.S. CMS and IHEP (China), presented their facility evolution plans.

      • U.S. CMS presented their plans to concentrate future storage purchases at the Fermilab Tier-1 and processing (including for analysis facilities) at the university Tier-2 sites, driven by datacenter limitations and simplification of sites to allow scaling up for HL-LHC with existing levels of effort. There were questions about Alpaka (performance portability layer) and analysis facility plans.

      • China will run university network paths through IHEP, taking advantage of the recently upgraded LHCOPN/LHCONE connections at 100 Gbps to the international research networks. IHEP is also planning more vendor-heterogeneous (locally produced) CPU and GPU purchases to support ML/AI.

      Other Presentations (Apr. 22nd and May 13th)

      Also at the HEPiX Spring Workshop, on April 22nd, O. Smirnova (Lund) reported on behalf of the TCB on the December 2025 WLCG Workshop on Heterogeneous Architectures [Indico].

      This presentation was repeated on May 13th at the EPIC Workshop on Heterogeneous Computing [Indico] by D. Benjamin (BNL), where there were questions and comments about the potential for collaboration across experiments or with governments to develop ML training infrastructure, about portability layers and vendor choice, and about the cost evolution of GPUs.

       

    • 5:00 PM 5:05 PM
      AOB 5m
      • Next MB Meeting: Tuesday 16 June 2026 1m