US ATLAS Computing Facility
Facilities Team Google Drive Folder
Zoom information
Meeting ID: 993 2967 7148
Meeting password: 452400
Invite link: https://umich.zoom.us/j/99329677148?pwd=c29ObEdCak9wbFBWY2F2Rlo4cFJ6UT09
-
-
13:00
→
13:05
WBS 2.3 Facility Management News 5m
Speakers: Alexei Klimentov (Brookhaven National Laboratory (US)), Dr Shawn Mc Kee (University of Michigan (US))
WLCG Open Technical Forum (OTF) meeting #6 was held over the last two days: https://indico.cern.ch/event/1562124/
ATLAS S&C is in ~2 weeks. The agenda is still evolving: https://indico.cern.ch/event/1509065/
This week's ADC Coordination meeting postponed the discussion of the walltime limit for ATLAS until the next meeting (Sep 16), when Ivan should be able to attend.
On Friday, 9-10 AM Eastern, there will be a discussion about the Trusted-CI engagement for those interested: https://www.google.com/url?q=https://umich.zoom.us/j/93713387827?pwd%3D0NbAN2tYXlRMHKxjbpJmv3jqgmwopu.1&sa=D&source=calendar&ust=1757937587257560&usg=AOvVaw2gFop8OdF05hMz2nE2_rmL
-
13:05
→
13:10
OSG-LHC 5m
Speakers: Brian Hua Lin (University of Wisconsin), Matyas Selmeci
-
13:10
→
13:30
WBS 2.3.1: Tier1 Center
Convener: Alexei Klimentov (Brookhaven National Laboratory (US))
-
13:10
Tier-1 Infrastructure 5m
Speaker: Jason Smith
- 13:15
-
13:20
Storage 5m
Speaker: Carlos Fernando Gamboa (Brookhaven National Laboratory (US))
-
13:25
Tier1 Operations and Monitoring 5m
Speaker: Ivan Glushkov (Brookhaven National Laboratory (US))
-
13:30
→
13:40
WBS 2.3.2 Tier2 Centers
Updates on US Tier-2 centers
Conveners: Fred Luehring (Indiana University (US)), Rafael Coelho Lopes De Sa (University of Massachusetts (US))
- Great running recently...
- Only item of note was a planned downtime at NET2 for an OKD update.
- Various minor issues:
- CVMFS problems at AGLT2
- Jobs hitting the wall time limit
- Set up MWT2 to allow Paul to test a new pilot version that enables the sub-cgroup memory limit.
- Need to email Paul so he can start his testing.
- Progressing at CPB on migrating data to new servers and getting storage updated to EL9.
- Future XRootD updates require EL9.
- Please do not buy any equipment until we have guidance from management.
-
13:40
→
13:50
WBS 2.3.3 Heterogeneous Integration and Operations
HIOPS
Convener: Rui Wang (Argonne National Laboratory (US))
-
13:40
HPC Operations 5m
Speaker: Rui Wang (Argonne National Laboratory (US))
-
13:45
Integration of Complex Workflows on Heterogeneous Resources 5m
Speakers: Doug Benjamin (Brookhaven National Laboratory (US)), Xin Zhao (Brookhaven National Laboratory (US))
-
13:50
→
14:10
WBS 2.3.4 Analysis Facilities
Conveners: Ofer Rind (Brookhaven National Laboratory), Wei Yang (SLAC National Accelerator Laboratory (US))
- AF Debrief/Planning meeting with Lincoln last week (notes)
- Updates to AF Docs (need to sort out GitHub roles/permissions)
- At last week's 2.3/5 meeting: discussion of AI tools (Shuwei), GPU metrics and Heavy-Ion storage requests at BNL
-
13:50
Analysis Facilities - BNL 5m
Speaker: Qiulan Huang (Brookhaven National Laboratory (US))
-
13:55
Analysis Facilities - SLAC 5m
Speaker: Wei Yang (SLAC National Accelerator Laboratory (US))
-
14:00
Analysis Facilities - Chicago 5m
Speaker: Fengping Hu (University of Chicago (US))
-
14:10
→
14:25
WBS 2.3.5 Continuous Operations
Convener: Ofer Rind (Brookhaven National Laboratory)
-
14:10
ADC Operations, US Cloud Operations: Site Issues, Tickets & ADC Ops News 5m
Speaker: Ivan Glushkov (Brookhaven National Laboratory (US))
- WLCG OTF #6 meeting this week
- Day 1: Environmental Sustainability; Day 2: Network Updates & Challenges
- Ongoing work to configure external Varnish service for BNL
- Numerous issues with network routing and configuration of servers at NET2
- Ilija has some measurements of the performance effect of large Frontier cache latency
- TW-FTT is sending an engineer to CERN in October/November to help integrate ASGC into the US Cloud support team
- WLCG Ops Meeting last week: updates on CRIC and HC status, AVX2 policy
-
14:15
Services DevOps 5m
Speaker: Ilija Vukotic (University of Chicago (US))
XCache
- ESnet will be upgraded
- OX XCache issues fixed by manual restarts
- New XCache service certificate made and given to UK users
- Raphael (Wuppertal) is thinking of testing HTTP caching
AI
- Deployed OpenWebUI as a frontend for AF assistant
- Setting up all the functionality will be quite involved
Varnish
- We have a backup Frontier cluster and service up and running
- US Varnishes (except SWT2_CPB) updated to see the backup too.
- NET2 now using their own Varnish
- BNL uses an NRP node at MGHPCC (very large latency, performance half that of a local squid). Trying to get it to connect to the NET2 Varnish.
- Working with Asoka and Chris on changes to Tier3 and lxplus settings.
- Waiting on Ryu to see how to change the settings at TACC.
- Starting to rebuild CREST.
- Changes made to the CVMFS Varnishes to set a larger nuke number. Working on it with Wenjing. Initial results are very promising.
-
14:20
Facility R&D 5m
Speaker: Lincoln Bryant (University of Chicago (US))
- Facility R&D Biweekly last week (minutes)
-
14:25
→
14:35
AOB 10m
