US ATLAS Computing Integration and Operations
-
-
1
Top of the Meeting
Speakers: Eric Christian Lancon (BNL), Robert William Gardner Jr (University of Chicago (US))
-
2
ADC news and issues
Speakers: Robert Ball (University of Michigan (US)), Wei Yang (SLAC National Accelerator Laboratory (US))
bigpanda will shortly switch from http to https access. CERN SSO will then typically be used to grant access, but JSON can still be scraped via http. This transition should take place within the next 2 weeks. For further details see:
https://indico.cern.ch/event/642827/contributions/2608310/attachments/1490643/2317018/httpS_for_bigpanda_monitoring.pdf
Wei is leading an effort to deploy singularity in the US cloud. This is a voluntary effort, where the underlying WN OS should be centos7. For issues and procedural information, see:
https://twiki.cern.ch/twiki/bin/view/AtlasComputing/ContainersInUScloud
From Andrej Filipcic, a brief summary:
- pilotcode supporting singularity is now in production, we start testing targeted sites (RAL, Manchester, ...) with it
- singularity could also be started in the wrapper, but since we have the pilotcode ready, we try to use that
- for now we continue to use catchall, when we get more experience with site specifics, we think about what should be moved to site configuration (singularity.conf), what in AGIS, and if we can simplify things like using scratchdisk by relocating the bind mounts
- by September we should test most of T1s, some big T2s, so we have some input for the containers task force
- testing the containers should be done in a similar way as it was done with the new mover
- we follow up with hammercloud team to implement singularity HC testing
- for performance reasons, we should migrate to unpacked chroot. The img and the dir should be kept in sync, we will need both (eg img for HPC)
- we should concentrate on centos7 sites. later on we should also test the centos7 images.
- at pre-gdb, there was a discussion whether to go with non-suid singularity deployment. We also need to evaluate if this is feasible for ATLAS or not. Some sites might want to use it in the future. (it's not even available at this point in centos7, maybe with RH7.4)
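As an illustration of the bind-mount point in the summary above, here is a minimal sketch of how a `singularity exec` command line with bind mounts could be assembled. It is not the pilot's actual code: the helper name `build_singularity_command`, the image path, the scratch paths, and the payload are all hypothetical. The only thing taken from the singularity CLI itself is the `exec` subcommand and its `--bind src:dest` option. The image argument can be either an image file or an unpacked directory.

```python
import shlex  # shlex.join requires Python 3.8+


def build_singularity_command(image, payload, binds):
    """Return the argv list for `singularity exec` with the given bind mounts.

    image   -- path to an image file or an unpacked directory (hypothetical here)
    payload -- command and arguments to run inside the container
    binds   -- iterable of (host_path, container_path) pairs
    """
    argv = ["singularity", "exec"]
    for host_path, container_path in binds:
        # Each bind mount maps a host directory onto a path inside the container.
        argv += ["--bind", f"{host_path}:{container_path}"]
    argv.append(image)
    argv += list(payload)
    return argv


# Hypothetical example: relocate a per-job scratch directory to /scratch.
cmd = build_singularity_command(
    image="/path/to/unpacked_image_dir",
    payload=["/bin/sh", "-c", "echo hello"],
    binds=[("/local/scratch/job123", "/scratch")],
)
print(shlex.join(cmd))
```

Keeping the bind list as data rather than hard-coding it would make it straightforward to move site-specific mounts into site configuration or AGIS later, as the summary suggests.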
-
3
Production
Speaker: Mark Sosebee (University of Texas at Arlington (US))
-
4
Data Management
Speaker: Armen Vartapetian (University of Texas at Arlington (US))
-
5
Data transfers
Speaker: Hironori Ito (Brookhaven National Laboratory (US))
-
6
Containers
Speaker: Wei Yang (SLAC National Accelerator Laboratory (US))
-
7
Networks
Speaker: Dr Shawn McKee (University of Michigan ATLAS Group)
-
8
FAX and Xrootd Caching
Speakers: Andrew Bohdan Hanushevsky (SLAC National Accelerator Laboratory (US)), Andrew Hanushevsky, Andrew Hanushevsky (STANFORD LINEAR ACCELERATOR CENTER), Ilija Vukotic (University of Chicago (US)), Wei Yang (SLAC National Accelerator Laboratory (US))
-
9
HPCs integration
Speaker: Taylor Childers (Argonne National Laboratory (US))
-
Site Reports
- 10
-
11
AGLT2
Speakers: Robert Ball (University of Michigan (US)), Dr Shawn McKee (University of Michigan ATLAS Group)
-
12
MWT2
Speakers: David Lesny (Univ. Illinois at Urbana-Champaign (US)), Lincoln Bryant (University of Chicago (US))
-
13
NET2
Speaker: Prof. Saul Youssef (Boston University (US))
Issues:
1) Some old CAs fail to authenticate to Bestman. There is a fix from OSG (updated JGlobus) that we volunteered to test.
2) Harvard was down for a day to migrate their puppet infrastructure.
3) Downtime for 2) failed to propagate to AGIS for some reason. OSG guys are looking into it.
4) Lots of NESE activity.
5) GPFS client issue needs to be resolved before we can go to RH7 (& singularity).
6) Moving to 6-hour reporting of space token sizes to address the DDM deletion issue that Armen noticed.
7) Smooth running otherwise.
- 14
-
15
SWT2-UTA
Speaker: Patrick Mcguigan (University of Texas at Arlington (US))
-
16
WT2
Speaker: Wei Yang (SLAC National Accelerator Laboratory (US))
-
17
AOB