US ATLAS Computing Facility (Possible Topical)

US/Eastern
Description

Facilities Team Google Drive Folder

Zoom information

Meeting ID:  993 2967 7148

Meeting password: 452400

Invite link:  https://umich.zoom.us/j/99329677148?pwd=c29ObEdCak9wbFBWY2F2Rlo4cFJ6UT09

 

 

    • 13:00 13:05
      WBS 2.3 Facility Management News 5m
      Speakers: Alexei Klimentov (Brookhaven National Laboratory (US)), Dr Shawn Mc Kee (University of Michigan (US))
    • 13:05 13:10
      OSG-LHC 5m
      Speakers: Brian Hua Lin (University of Wisconsin), Matyas Selmeci
    • 13:10 13:30
      WBS 2.3.1: Tier1 Center
      Convener: Alexei Klimentov (Brookhaven National Laboratory (US))
      • 13:10
        Tier-1 Infrastructure 5m
        Speaker: Jason Smith
      • 13:15
        Compute Farm 5m
        Speaker: Thomas Smith

        Operations have been very smooth

      • 13:20
        Storage 5m
        Speaker: Carlos Fernando Gamboa (Brookhaven National Laboratory (US))
      • 13:25
        Tier1 Operations and Monitoring 5m
        Speaker: Ofer Rind (Brookhaven National Laboratory)
        • Smooth operations
        • Meeting with ESnet tomorrow to discuss HL-LHC requirements, Data Challenges, and networking
    • 13:30 13:40
      WBS 2.3.2 Tier2 Centers

      Updates on US Tier-2 centers

      Conveners: Fred Luehring (Indiana University (US)), Rafael Coelho Lopes De Sa (University of Massachusetts (US))
      • Good running over the last few months.
        • No significant outages.
      • Our focus is on procurement; the next meeting is at noon EST this Friday, February 27.
      • The older storage at CPB is being updated to EL9 at a good pace.
      • OU has updated to OSG25.
      • Still waiting for dCache version 11.2.1 to be released.
    • 13:40 13:50
      WBS 2.3.3 Heterogeneous Integration and Operations

      HIOPS

      Convener: Rui Wang (Argonne National Laboratory (US))
      • 13:40
        HPC Operations 5m
        Speaker: Rui Wang (Argonne National Laboratory (US))
      • 13:45
        Integration of Complex Workflows on Heterogeneous Resources 5m
        Speaker: Doug Benjamin (Brookhaven National Laboratory (US))
    • 13:50 14:10
      WBS 2.3.4 Analysis Facilities
      Convener: Wei Yang (SLAC National Accelerator Laboratory (US))
      • 13:50
        Analysis Facilities - BNL 5m
        Speaker: Qiulan Huang (Brookhaven National Laboratory (US))
        • User space cleanup: Viviana provided a starting storage policy to use as a reference

          • Prepared Python code, now being tested, that automatically sends email notifications to inactive users in line with the inactive account policy

        • Ofer, Rob, and Tom are working with Giordon to set up a new AF benchmarking monitor on BNL OpenShift
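
        The inactive-user notification mentioned above could look roughly like the sketch below. It is not the code prepared at BNL: the 180-day threshold, the function names, and the example.org addresses are all hypothetical, and the sketch only builds the messages without sending them.

```python
from datetime import date, timedelta
from email.message import EmailMessage

# Hypothetical threshold; the actual inactive account policy is not given in these notes.
INACTIVITY_DAYS = 180


def find_inactive_users(last_login, today, max_idle_days=INACTIVITY_DAYS):
    """Return the sorted usernames whose last login is more than max_idle_days before today."""
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(user for user, seen in last_login.items() if seen < cutoff)


def build_notice(user, last_seen, sender="af-admin@example.org", domain="example.org"):
    """Build (but do not send) a notification email for one inactive user."""
    msg = EmailMessage()
    msg["From"] = sender  # hypothetical sender address
    msg["To"] = f"{user}@{domain}"  # hypothetical mapping from username to address
    msg["Subject"] = "Inactive account notice"
    msg.set_content(
        f"Dear {user},\n\n"
        f"Our records show no login since {last_seen.isoformat()}.\n"
        "Please log in or contact the facility team if you still need this account.\n"
    )
    return msg


if __name__ == "__main__":
    # Made-up login records, for illustration only.
    logins = {"alice": date(2025, 1, 1), "bob": date(2026, 2, 1)}
    for name in find_inactive_users(logins, today=date(2026, 2, 25)):
        print(build_notice(name, logins[name]))
```

        Keeping message construction separate from delivery lets the selection logic be tested without a mail server; delivery could then be added with the standard library's `smtplib.SMTP.send_message`.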
      • 13:55
        Analysis Facilities - SLAC 5m
        Speaker: Wei Yang (SLAC National Accelerator Laboratory (US))
      • 14:00
        Analysis Facilities - Chicago 5m
        Speaker: Fengping Hu (University of Chicago (US))

        Jupyter Notebook Services Updates

        • Image Rationalization: Consolidated and reorganized notebook images, reducing ml_platform variants (e.g., conda, Julia) to simplify the user experience and streamline maintenance.

        • Unified Monitoring Framework: Launched three interlinked dashboards covering JupyterLab, Coffea-Casa, and BinderHub services.

        • Cluster-Level Visibility: High-level view of server health, resource allocation trends, and GPU utilization across environments.

        • User Analytics: Per-user usage metrics to identify heavy usage patterns and support capacity planning.

        • Infrastructure Efficiency: Pod-level observability to optimize resource allocation and improve overall service efficiency.

    • 14:10 14:30
      WBS 2.3.5 Continuous Operations
      Conveners: Ivan Glushkov (Brookhaven National Laboratory (US)), Ofer Rind (Brookhaven National Laboratory)
      • T3 LOCALGROUPDISKs: we do not maintain them, but users are using them and running into failures (Nevis).
        • That particular problem (a certificate issue) has been solved.
      • Restarting FTS4 tests now.
        • Load tests with FT transfers at the beginning of March.
        • Manchester will be the first production site to switch to FTS4
      • Stopping IPv4
        • on LHCONE for AGLT2 on March 10
        • on LHCOPN for PIC on February 23
      • 14:10
        ADC Operations, US Cloud Operations: Site Issues, Tickets & ADC Ops News 5m
        Speaker: Kaushik De (University of Texas at Arlington (US))
        • Very nice presentation of US mini-DC results at WLCG DOMA General earlier today (link)

        • AI email report for SWT2 from Kaushik.
      • 14:15
        Services DevOps 5m
        Speaker: Ilija Vukotic (University of Chicago (US))
      • 14:20
        Facility R&D 5m
        Speaker: Robert William Gardner Jr (University of Chicago (US))
        • Facility R&D Biweekly (notes): updates on RP1, SENSE, HTCondor-related development, ...

      • 14:25
        Cybersecurity plan(s) 5m
        Speakers: Robert William Gardner Jr (University of Chicago (US)), Shigeki Misawa (Brookhaven National Laboratory (US))
    • 14:30 14:40
      AOB 10m