Today is the OSG area coordinators meeting focusing on networking. All are welcome to join: https://opensciencegrid.github.io/management/area-coordinators/
We are getting the word out on two items related to perfSONAR:
The next HEPiX NFV working group meeting is coming up April 25: https://indico.cern.ch/event/715631/
C6420 chassis and R6420 sleds (56 HT cores) have been delivered to UM and MSU. 7 sleds are currently online and running HTCondor jobs. 12 additional sleds should be configured by COB Friday. The balance awaits the installation of a 10Gb switch at MSU.
These C6420s are current hogs. See the attached plots.
The UM site has a coolant leak that will be addressed tomorrow. Unfortunately, all cooling must be shut down to find the leak, so today we began idling down the WNs to power them off, minimizing the room heat load. Service VMs and machines, along with dCache storage, will remain online, hopefully within the cooling available from a portable unit that will be put in place during the repairs. These repairs are expected to be completed by COB tomorrow.
Approximately 2/3 of our WN total are now running SL7. The last 1/3 will transition more slowly, as operation of the muon-calibration center needs to be migrated carefully.
Our space reporting via Ruby script transitioned to SL7 on Friday and broke, because the Ruby ActiveRecord version jumped by two major versions, from 2.3.18 to 4.2.6. This was fixed by Shawn over the weekend, and our space reporting is once again up to date. The wiki page below has been updated with the changed code.
https://twiki.cern.ch/twiki/bin/view/AtlasComputing/DcacheSpaceReportingJsonViaRuby
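For context, the job of the script is straightforward: query the space-manager database and publish per-space-token usage as JSON. The sketch below is only a hypothetical Python illustration of that idea, with made-up table and column names (space_tokens, total_bytes, used_bytes) rather than the actual dCache schema or the Ruby code on the wiki page.

    # Hypothetical sketch of a space-reporting script: query per-token usage, emit JSON.
    # Table and column names are invented; the real data lives in the dCache space-manager DB.
    import json
    import sqlite3  # stand-in for the actual database driver

    def space_report(conn):
        rows = conn.execute(
            "SELECT token_name, total_bytes, used_bytes FROM space_tokens"
        ).fetchall()
        return {
            name: {"total": total, "used": used, "free": total - used}
            for name, total, used in rows
        }

    if __name__ == "__main__":
        # Tiny in-memory example so the sketch runs as-is.
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE space_tokens (token_name TEXT, total_bytes INT, used_bytes INT)")
        conn.execute("INSERT INTO space_tokens VALUES ('ATLASDATADISK', 1000, 400)")
        print(json.dumps(space_report(conn), indent=2))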
The xrootd problem reported a few days ago and summarized at this URL
https://github.com/dCache/dcache/pull/3562
has not been observed here to the best of my knowledge. We are running dCache 4.0.5 with a mix of xrootd rpms, depending on whether the WN is SL6 (xrootd 4.6.1) or SL7 (4.8.1).
AGLT2 now has a full suite of SL7 PandaQueues in operation. SL6 PQs are also still enabled and will remain so until the last SL6 WN is retired.
To quote Xin: "overlay option disabled in singularity on all WNs, in light of the reported vulnerability".
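For reference, the usual way to do this is the "enable overlay" knob in singularity.conf. The lines below are only a sketch, assuming a Singularity 2.x-style /etc/singularity/singularity.conf; they are not a copy of the actual change on the WNs.

    # /etc/singularity/singularity.conf (sketch; exact path and option set depend on the version)
    # Allowed values are yes, no, or try; "no" disables use of the overlay filesystem.
    enable overlay = no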
MWT2 Update
Stampede2 Update
NET3 ("Northeast Tier 3") successfully launched, with 38 users from BU, Harvard, and UMass Amherst.
BU+MIT+ESnet are working on re-establishing our LHCONE peering. The current theory is that a bad card at MANLAN is causing the problem. We have been jumping on and off LHCONE over the past week for testing.
Working on finishing the LCMAPS migration (with Brian Lin helping), turning off GRAM at BU, and migrating from BeStMan to Wei's GridFTP with an Adler32 checksum callout (see the checksum sketch below).
Lots of NESE activity: the first NESE deployment (10.8 PB) has been ordered. The first major test will be an ATLAS storage endpoint.
Production is running smoothly, with a noticeable switchover from mcore domination to production domination.
Start planning for RH7 migration.
Annual MGHPCC maintenance down day is May 22.
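On the Adler32 callout mentioned above: the callout is Wei's GridFTP checksum plugin, but the value it returns is just a chunked zlib Adler32 over the file. The Python sketch below only illustrates that computation (the chunk size is arbitrary); it is not the plugin itself.

    # Illustration of the Adler32 computation a GridFTP checksum callout performs.
    import sys
    import zlib

    def adler32_of_file(path, chunk_size=4 * 1024 * 1024):
        value = 1  # Adler32 starts at 1 by definition
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                value = zlib.adler32(chunk, value)
        return format(value & 0xFFFFFFFF, "08x")

    if __name__ == "__main__":
        print(adler32_of_file(sys.argv[1]))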
- all sites running well
- Lucille is having A/C issues, though, so running at reduced capacity
- still waiting for RUCIO server update in order to take advantage of local xrootd redirector for jobs running on OSCER
- singularity tests currently on hold because we can't run without overlay turned on
UTA_SWT2:
SWT2_CPB:
General: