US ATLAS Computing Facility

US/Eastern
Description

Facilities Team Google Drive Folder

Zoom information

Meeting ID:  993 2967 7148

Meeting password: 452400

Invite link:  https://umich.zoom.us/j/99329677148?pwd=c29ObEdCak9wbFBWY2F2Rlo4cFJ6UT09

    • 1
      WBS 2.3 Facility Management News
      Speakers: Alexei Klimentov (Brookhaven National Laboratory (US)), Dr Shawn Mc Kee (University of Michigan (US))
    • 2
      OSG-LHC
      Speakers: Brian Hua Lin (University of Wisconsin), Matyas Selmeci

      Release (this week)

      • OSG 25!
      • osg-vo-client with LSST VO updates
    • WBS 2.3.1: Tier1 Center
      Convener: Alexei Klimentov (Brookhaven National Laboratory (US))
      • 3
        Tier-1 Infrastructure
        Speaker: Jason Smith
      • 4
        Compute Farm
        Speaker: Thomas Smith

        There are issues with the PDUs at the BNL Data Center; it appears this was not foreseen when the data center was designed.

        [Only CPU racks are affected; by design they have 1 PDU per row.]

        BNL F&O is checking all PDUs, and we are working on a possible workaround.

        We still look good pledge-wise: BNL T1 delivers its 2025 pledge, although right now we are ~3% below the pledge.

        [We want to avoid speculation that 1/3 of the US ATLAS facilities at BNL are down.]

        More information will follow later today and in one week, after the scheduled F&O intervention to inspect all PDUs.

      • 5
        Storage
        Speaker: Carlos Fernando Gamboa (Brookhaven National Laboratory (US))

        dCache and HPSS systems are operating smoothly.

        HPSS data repacking is in progress, migrating data from LTO-6 to LTO-8 tape drives.

      • 6
        Tier1 Operations and Monitoring
        Speaker: Ivan Glushkov (Brookhaven National Laboratory (US))
    • WBS 2.3.2 Tier2 Centers

      Updates on US Tier-2 centers

      Conveners: Fred Luehring (Indiana University (US)), Rafael Coelho Lopes De Sa (University of Massachusetts (US))
    • WBS 2.3.3 Heterogeneous Integration and Operations

      HIOPS

      Convener: Rui Wang (Argonne National Laboratory (US))
      • 7
        HPC Operations
        Speaker: Rui Wang (Argonne National Laboratory (US))
        • Perlmutter: 10% of the CPU allocation and 33% of the GPU allocation remain. Stable.
          • GPU usage is low.
        • TACC: the UCORE queue came back online yesterday.
          • Rebuilt CVMFSexec to the latest version.
          • Increasing the job throughput to speed up allocation usage (requested by Doug for the dCache test).
      • 8
        Integration of Complex Workflows on Heterogeneous Resources
        Speakers: Doug Benjamin (Brookhaven National Laboratory (US)), Xin Zhao (Brookhaven National Laboratory (US))
    • WBS 2.3.4 Analysis Facilities
      Conveners: Ofer Rind (Brookhaven National Laboratory), Wei Yang (SLAC National Accelerator Laboratory (US))
      • 9
        Analysis Facilities - BNL
        Speaker: Qiulan Huang (Brookhaven National Laboratory (US))
        • Doug and Louis are deploying a new, unified, federated JupyterHub frontend serving US ATLAS, FCC, and DUNE users.
      • 10
        Analysis Facilities - SLAC
        Speaker: Wei Yang (SLAC National Accelerator Laboratory (US))
      • 11
        Analysis Facilities - Chicago
        Speaker: Fengping Hu (University of Chicago (US))
    • WBS 2.3.5 Continuous Operations
      Convener: Ofer Rind (Brookhaven National Laboratory)
      • 12
        ADC Operations, US Cloud Operations: Site Issues, Tickets & ADC Ops News
        Speaker: Ivan Glushkov (Brookhaven National Laboratory (US))
        • New pilot released; it includes updated cgroups memory limits for subprocesses (tested at MWT2).
          • Some paths that prmon needs to obtain GPU info are missing (e.g. /usr/sbin was added, but not /usr/bin?) (Jira)
      • 13
        Services DevOps
        Speaker: Ilija Vukotic (University of Chicago (US))
      • 14
        Facility R&D
        Speaker: Lincoln Bryant (University of Chicago (US))
    • 15
      AOB