Operations team & Sites

Europe/London
EVO - GridPP Operations team meeting

Description
- This is the biweekly ops & sites meeting.
- The intention is to run the meeting in EVO: http://evo.caltech.edu/evoGate/. Join the meeting in the GridPP Community area.
- The phone bridge number is +44 (0)161 306 6802 (CERN number +41 22 76 71400). The phone bridge ID is 115728 with code: 4880.
Apologies: Pete
    • 11:00 11:20
      Meetings & updates 20m
      - ROD team update
        Performance ticket: https://ggus.eu/ws/ticket_info.php?ticket=76103
      - EGI ops
        CERN Top BDII Survey:
        • Which sites are using the lcg-bdii.cern.ch as main top-BDII configured at client side (and possibly why they have chosen to use it)?
        • Which sites are using the lcg-bdii.cern.ch as backup top-BDII, when the first one is not available?
        Upgrade strategies and EMI release uptake
      - Nagios status
      - Tier-1 update
      - Security update
      -- T2 issues
      The geant4 VOMS information in https://www.gridpp.ac.uk/wiki/GridPP_approved_VOs is/was out of date. The introduction of lsc files meant that "all the CERN VOs need both voms.cern.ch and lcg-voms.cern.ch" in the site configuration.
      -- General notes.
      - First allocations from the GridPP accounting period will be circulated shortly. It is noted (from site comments) that some sites have small amounts of under reporting in their APEL data but unless there are any major issues the current figures will be used as of midday. APEL accounting data is only used to divide the LHCb/other contribution which is significantly smaller than ATLAS and CMS.
      - The CB received a report last week about extra funding available for network infrastructure. Talk to your CB member if you are not aware of the GridPP presentation last week on the topic.
      - TEG ops group request to check on this ticket and comment. It concerns error messages: https://ggus.eu/tech/ticket_show.php?ticket=74911
      - Stephen Burke will attend an info system meeting on December 1st: https://www.egi.eu/indico/conferenceDisplay.py?confId=654. Please let him know if you have any input/comments.
      - There is a GDB tomorrow: https://indico.cern.ch/conferenceDisplay.py?confId=106650. Steve Jones is our Tier-2 rep this time. Topics being covered are:
        - Workload management
        - Data management
        - Operational tools (Note there is still a request for more sysadmins to provide input on these areas https://twiki.cern.ch/twiki/bin/view/LCG/WLCGTEGOperations).
        - Accounting
        - EMI
        - gLite
        - CREAM/SGE support
        - HEPiX summaries (Meeting; Benchmarking; Virtualisation)
      - Checking tickets for NGI_UK: http://tinyurl.com/6etw8gm Or go to https://ggus.eu/ws/ticket_search.php and select Support Unit:NGI_UK and Creation date: Any and Status: open states - then click Go.
        To look at....
        Green:
        https://ggus.eu/ws/ticket_info.php?ticket=76023 (wrongly assigned to PPD?)
        https://ggus.eu/ws/ticket_info.php?ticket=75957 (Birmingham)
        Red:
        https://ggus.eu/ws/ticket_info.php?ticket=75538 (Cambridge)
        https://ggus.eu/ws/ticket_info.php?ticket=75488 (Durham)
        https://ggus.eu/ws/ticket_info.php?ticket=74887 (Cambridge) - still publishing biomed
    • 11:20 11:40
      Experiment problems/issues 20m
      Review of weekly issues by experiment/VO
      - LHCb
      - CMS
      - ATLAS
      - Other
      - Experiment blacklisted sites
      - Experiment known events affecting job slot requirements
      - Site performance/accounting issues
      - Metrics review
    • 11:40 11:50
      PhenoGrid issues 10m
      - PhenoGrid have reported dissatisfaction with job efficiencies. One large problem relates to an inability to renew proxies on the RAL WMS: https://ggus.eu/ws/ticket_info.php?ticket=74353.
      - A list of other failure messages for this meeting to look at is attached to the agenda.
      Job failures
    • 11:50 11:55
      Upcoming UK meetings 5m
      - HEPSYSMAN (Thursday 10th)
        The agenda is here: https://indico.cern.ch/conferenceDisplay.py?confId=157131. As discussed last week the idea is to allow more time for discussion on certain themes.
      - Core ops tasks (Friday 11th)
        The meeting will spend about 30 minutes on each core area https://www.gridpp.ac.uk/wiki/Category:GridPP_Operations with the intention of having well defined tasks in each area for which core team members take responsibility. Even those not part of the core team are invited to comment on where we need to improve in each of these areas:
        - Staged rollout
        - On-duty
        - Ticket follow up
        - Regional tools
        - Documentation
        - Security
        - Monitoring
        - Accounting
        - Core services
        - Wider VO issues
        - Grid interoperation
        - Overall strategy
    • 11:55 12:00
      Actions 5m
      - https://www.gridpp.ac.uk/wiki/Operations_Team_Action_items
    • 12:00 12:01
      AOB 1m