Operations team & Sites

Europe/London
EVO - GridPP Operations team meeting

Description
- This is the biweekly ops & sites meeting.
- The intention is to run the meeting in EVO: http://evo.caltech.edu/evoGate/. Join the meeting in the GridPP Community area.
- The phone bridge number is +44 (0)161 306 6802 (CERN number +41 22 76 71400). The phone bridge ID is 78425 with code: 4880.
Apologies: Alessandra, Andrew, Elena
Minutes
    • 1
      Meetings & updates
      - ROD team update
        -- Rota https://www.gridpp.ac.uk/wiki/ROD_rota needs to be updated soon. (Action on Jeremy)
      - Nagios status
      - Tier-1 update
      - Security update
        -- T2 issues
        -- General notes.
      - Tickets
        Direct link: http://tinyurl.com/3jjnvca
        If not working, indirect link: https://ggus.eu/ws/ticket_search.php (select support unit 'ROC_UK/Ireland' and Creation Date 'Any') or paste https://ggus.eu/ws/ticket_info.php?ticket= and type a ticket number for the URL.
        73365: Oxford. H1. Jobs aborted after submission via t2ce02. Waiting for new CREAM release. Should be 'on-hold'?
        73280: Brunel. Biomed - reopened SE issue (from Nagios tests).
        73187: UCL-HEP. ATLAS. Jobs fail on some WNs with pilot: Get error: Copy command self timed out after 3605 s. Queue suspended for now.
        72903: Region. Configure Nagios for NGI. Progress?
        72359: RAL myproxy for T2K. Cross-ref 72358.
        72358: T2K myproxy... escalated to FTS developers. Long discussion on documentation and usage. Ticket now on hold pending FTS developers making a change.
        72161: IC-HEP. T2K. 3TB spacetoken created. Waiting for user to test.
        72160: Oxford. T2K spacetoken. Ewan, is this waiting for a user response?
        72156: QMUL. T2K spacetoken. Waiting for user since 2nd August.
        68865: UCL-HEP. Retirement of SL4 and 32bit DPM Head nodes and Servers. On hold.
        68859: Durham. Retirement of SL4 and 32bit DPM Head nodes and Servers. On hold.
        68858: Glasgow. Retirement of SL4 and 32bit DPM Head nodes and Servers. On hold.
        68853: RAL T1. Master ticket. Brian reviewing recommended versions.
        68077: RAL T1. Mandatory WLCG InstalledOnlineCapacity not published. Expect test version this month.
        64995: RAL T1. No GlueSACapability defined for WLCG Storage Areas. Should have something you can test this month (August).
        57746: Cambridge. Karl has tested again (in August) and still sees problems!
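
The ticket notes above say to paste https://ggus.eu/ws/ticket_info.php?ticket= and append a ticket number. As a minimal sketch of that step, the snippet below builds the per-ticket URLs for a few of the ticket numbers listed. The function name `ggus_ticket_url` and the constant `GGUS_TICKET_INFO` are hypothetical names chosen for illustration; only the base URL and the `ticket` query parameter come from the notes.

```python
from urllib.parse import urlencode

# Base URL quoted in the ticket notes; the query parameter is "ticket".
GGUS_TICKET_INFO = "https://ggus.eu/ws/ticket_info.php"


def ggus_ticket_url(ticket_id: int) -> str:
    """Return the GGUS ticket page URL for a ticket number (hypothetical helper)."""
    return f"{GGUS_TICKET_INFO}?{urlencode({'ticket': ticket_id})}"


if __name__ == "__main__":
    # A few ticket numbers taken from the list in these minutes.
    for ticket_id in (73365, 73280, 73187):
        print(ggus_ticket_url(ticket_id))
```

This only formats the link; it does not contact GGUS or check that a ticket exists.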
    • 2
      Experiment problems/issues
      Review of weekly issues by experiment/VO
      - LHCb
        Low activity at Tier-2 sites.
        At RAL: 3 lost files due to a bad tape and VO_LHCB_SW_DIR problems continued till last Friday.
      - CMS
      - ATLAS
      - Other
      - Experiment blacklisted sites
      - Experiment known events affecting job slot requirements
      - Site performance/accounting issues
        -- No accounting problems noted
      - Metrics review http://pprc.qmul.ac.uk/~lloyd/gridpp/hs06.html
      ATLAS report
    • 3
      Discussion
      - Current problems & issues
      - Status of EMI-1/UMD-1 testing
    • 4
      Actions
      - http://www.gridpp.ac.uk/wiki/Deployment_Team_Action_items
    • 5
      AOB
      Reminder of the Tier-1 liaison meeting: http://www.gridpp.ac.uk/wiki/RAL_Tier1_Experiments_Liaison_Meeting.