RAL Tier1 Experiments Liaison Meeting
Europe/London
Access Grid (RAL R89)
13:30
Experiment Operational Issues
-
1
ATLAS Operations Report
Speakers: Brij Kishor Jashal (Rutherford Appleton Laboratory), Jyoti Prakash Biswal (Rutherford Appleton Laboratory)
-
2
CMS Operations Report
Speaker: Katy Ellis (Science and Technology Facilities Council STFC (GB))
Katy may or may not make the meeting, as she is in a 3-day CMS Computing Management meeting.
Red day for SAM on Monday due to another network issue lasting a few hours.
Otherwise a very good week for CMS at T1 - many running cores at end of last week and since the Monday network issues; highest CPU efficiency among all T1s.
A few residual failures on the Echo->Antares link since Monday's network issues remain to be investigated; they may still be cleaned up automatically in time. Otherwise, transfer performance looks good. Wrote over 700 TB to tape this week.
-
3
LHCb Operations Report
Speaker: Alexander Rogovskiy (Rutherford Appleton Laboratory)
Operational issues:
- Network outage on Monday (GGUS:683544)
  - Site-wide network outage, affected LHCb as well (obviously)
  - Fixed on Monday evening
  - Is the reason behind this known?
- New Echo storage node inaccessible from 201[89] gen WNs (GGUS:683524)
  - Caused local gateways to get stuck and, consequently, all downloads from Echo to fail
  - Fixed now; the network setup was changed to allow WNs to talk to the SN
- Redirector issues this morning
  - Wrong DNS alias?
CVMFS:
- (monitoring) issues with squid0[56] are still present
  - The machines are correctly resolvable from RAL, but not outside RAL (e.g. lxplus)
  - Added to internal DNS, but not external one?
  - FAB-1101 is tracking the issue
-
4
ALICE Operations Report
Speaker: Alexander Rogovskiy (Rutherford Appleton Laboratory)
-
5
LSST Operations Report
Speakers: Mathew Sims, Timothy Noble (Science and Technology Facilities Council STFC (GB))
- RC2 now ingested and running jobs
  - some hiccups, but seems to be there now
  - Step 3 still requests too little memory; this has not been fixed in the pipeline and requires alterations on a per-run basis. Once done, it runs as expected
  - Network blips seem to have extended the run time of jobs, but jobs are still succeeding
- IngestD v20 now deployed
- Nalin has workable demo deployments of the IngestD / LSST monitoring stack and will soon be deploying it for LSST monitoring, then looking at more in-depth job monitoring
-
14:00
Tier-1 Projects
-
6
Antares Upgrade
New EOS nodes
Repack Progress
Speakers: George Patargias, Thomas Byrne
-
7
XRootD Development
Speakers: Alexander Rogovskiy (Rutherford Appleton Laboratory), Jyothish Thomas (STFC)
-
8
Utilizing GPUs
Speakers: Jyoti Prakash Biswal (Rutherford Appleton Laboratory), Thomas Birkett
-
14:45
AOB
-
9
Summary of Operational Status and Issues
Speakers: Brian Davies (Lancaster University (GB)), Darren Moore
-
10
Any Other Business
Speakers: Brian Davies (Lancaster University (GB)), Darren Moore
-