Operations team & Sites

Europe/London
EVO - GridPP Operations team meeting

Description
- This is the biweekly ops & sites meeting.
- The intention is to run the meeting in EVO: http://evo.caltech.edu/evoGate/. Join the meeting in the GridPP Community area.
- The phone bridge number is +44 (0)161 306 6802 (CERN number +41 22 76 71400). The phone bridge ID is 115728 with code: 4880.
Apologies:
Minutes
    • 11:00 11:20
      UK NGI - monthly discussion 20m
      To improve NGI operations (currently GridPP and NGS), the second ops meeting each month will feature an NGI focus. This month the sub-agenda is as follows:
      1) Introduction and aims
      2) Helpdesk workflows. This month: use of 'in progress' for ticket status.
      3) NGI availability of services (concept of virtual sites bringing together services).
      Suggested topics for coming months:
      - Nagios monitoring and NGI use-cases for additional probes
      Slides
    • 11:20 11:40
      Experiment problems/issues 20m
      Review of weekly issues by experiment/VO
      - LHCb
      - CMS
      - ATLAS
      - Other
      - Experiment blacklisted sites
      - Experiment known events affecting job slot requirements
      - Site performance/accounting issues
      - Metrics review
    • 11:40 12:00
      Meetings & updates 20m
      - ROD team update
      - EGI ops
      - Nagios status
      - Tier-1 update
      - Security update
      -- T2 issues November availability
      -- General notes.
      - GDB tomorrow starts at 12:00 UK time: http://indico.cern.ch/conferenceDisplay.py?confId=155064. It will use Vidyo. GDB Tier-2 reps for this year need to be agreed shortly: https://www.gridpp.ac.uk/wiki/GDB_reports
      - Who has an oral or poster presentation accepted for CHEP this year?
      - Who is expecting to travel to CHEP?
      - We need to gather information for funding as there is also a WLCG workshop associated with CHEP 2012.
      - Tickets
      Starting the year with only 24 tickets. Nice. Only a few tickets caught my eye, for various reasons (mostly benign); there are quite a few on which progress will only resume this week. 77646 is the only alarming ticket.
      DURHAM
      https://ggus.eu/ws/ticket_info.php?ticket=77646
      Failed ops Nagios tests. Needs a reply asap.
      https://ggus.eu/ws/ticket_info.php?ticket=76487
      Shared area problems continue. Are the permissions alright on the new area?
      QMUL
      https://ggus.eu/ws/ticket_info.php?ticket=77959
      Chris is being ticketed over another batch of deletion errors at QMUL. This problem has its own Savannah entry: https://savannah.cern.ch/bugs/?90131. Chris has elected to keep the ticket open but on hold until a permanent fix for the issues can be found.
      GLASGOW
      https://ggus.eu/ws/ticket_info.php?ticket=77935
      Checking all the files on a suspect server for ATLAS. I'm interested in what tools they're using for the checksum checks.
      BIRMINGHAM
      https://ggus.eu/ws/ticket_info.php?ticket=77714
      cvmfs area not set up correctly for ATLAS; waiting on the software chaps to get back from their holidays to kick it. In the event of emergencies like this, how bad is it to mess with the $ATLAS_LOCAL_AREA by hand?
      https://ggus.eu/ws/ticket_info.php?ticket=76996
      Intermittent missing library problem which has been discussed in private among affected sites. Do we need to open the discussion to a wider list (i.e. TB-SUPPORT)?
      December stats
      November stats
    • 12:00 12:10
      Hardware 10m
      Updates/discussion - Network bids - Tier-2 hardware
    • 12:10 12:11
      AOB 1m