Topics
BrianL will be out starting Sept 21, returning Oct 15. Mátyás Selmeci will attend the facilities meetings in the meantime.
OSG 3.4.18
XRootD Overhaul
OSG Topology (formerly OIM)
To follow up on the cleanup of the leftover dark data at BNL: ~320 TB on DATADISK and ~100 TB on SCRATCHDISK.
Follow-up discussions about the next DDM dashboard took place during the last monitoring and TCB meetings. Since the dedicated monitoring meeting on Aug. 3, developers have been working on the new framework. The interface has already changed significantly to address all the suggestions.
Raised the question of the missing data in the DDM Accounting dashboard during the last monitoring meeting. I opened a SNOW ticket on that a while ago, but the person who was fixing the issues has left. Also raised the point that the new monitoring page, which is meant to replace the current one, is essentially not functional. We agreed to have a dedicated discussion on that too.
Analysis of US Tier-2 LHCONE network use is ongoing with ESnet, comparing and contrasting the ESnet metrics with the ATLAS and CMS numbers. Today there is a follow-up meeting to cover the ATLAS numbers. See spreadsheet at https://docs.google.com/spreadsheets/d/1zCdr-9avH-aDtXDTNGli1HZ245LETJud6amDn4S_Azg/edit#gid=895412619
The perfSONAR v4.1.1 update is out. It fixes the initial issues with 4.1.
The OSG/WLCG "meshconfig" (now "pSConfig") GUI running at AGLT2 MSU has some IPv6 connectivity issues. Some perfSONAR instances that are dual-stacked and NOT on LHCONE don't have connectivity to the psconfig.opensciencegrid.org host. We are working with MSU networking to determine what is wrong and how to get it fixed.
ML platform front-end developments:
Analytics service jobs:
XCache simulations:
News: Wenjing Wu just joined us yesterday (Sep 11) and will be taking over much of Bob Ball's work at AGLT2_UM once he retires in November. Wenjing will join the USATLAS mailing list.
We have been seeing problems with CVMFS and have found that some parts of our check_mk monitoring were contributing to the problem. We created a new RPM, tested it overnight, and are deploying it to all our worker nodes today. This may not have completely fixed the issue, but it has certainly helped, judging from the limited statistics from running since yesterday on a subset of nodes.
There is a problem routing IPv6 to MSU for non-LHCONE sites. It is being looked into by MSU and MERIT networking folks, and we hope to have a resolution soon.
Looking to coordinate buys for FY18
Power maintenance on Sept. 25; will absorb part of the HU equipment into the BU pods.
Plan to turn off BeStMan on Sept. 25 and go to GridFTP only.
NESE hardware at MGHPCC is half cabled; upgrading the NET2<->NESE networking path to multi-100 Gb/s.
On the agenda:
0. Orders for remaining FY18 hardware.
1. Complete absorption & retirement of HU_ queues.
2. Networking upgrade.
3. RH7 upgrade + do something about GPFS client.
4. Plan IPv6 for NESE gateways. Test NESE as ATLAS storage endpoint.
UTA Sites:
The HEPSPEC06 normalization factor used by APEL/WLCG for both UTA_SWT2 and SWT2_CPB is significantly wrong. It is correct in OIM and AGIS. We have a ticket open with GGUS to rectify the problem.
A change is being made to the campus network peering with LEARN for the Science DMZ. Previously, LHCONE traffic was carried by the UT-OTS network to a peering site with LEARN. We will now peer directly with LEARN on campus.
SWT2_CPB:
UTA_SWT2:
OU:
- OU_OSCER_ATLAS T2/T3 issue being worked on, WLCG ticket open
- xrootd TPC testbed working on OU_OSCER_ATLAS_SE, working on enabling dteam VO; OSG ticket open