Deployment team and UKI workshop

Europe/London
Room 532

Blackett Laboratory, Imperial College London, SW7 2BZ
Jeremy Coles
Description
This is a workshop to:
-- Improve understanding around UKI regional operations and responsibilities
-- Develop GridPP deployment and operations priorities for the coming year
The meeting is being held at: Blackett Laboratory, Imperial College London, SW7 2BZ
  • Thursday, 12 March
    • 10:00 10:10
      Introductions 10m
    • 10:10 11:30
      What regionalisation means for us 1h 20m
      - The on-duty tasks - Nagios and messaging - First-line support
    • 11:30 11:45
      TEA/COFFEE BREAK 15m
    • 11:45 13:00
      On-duty tasks 1h 15m
      - The shifts and schedules - The dashboard - Alarm procedures - Ticketing sites and follow-up
    • 13:00 14:00
      LUNCH 1h
    • 14:00 15:00
      Nagios and messaging 1h
      - The goals of common messaging
      - Status in the UKI
      - Requirements from the UKI setup
      - Plans and next steps
        -- What functionality and by when
        -- Who is doing what and when?
    • 15:00 15:30
      First-line support 30m
      - The current system - The UKI approach
    • 15:30 15:50
      TEA/COFFEE BREAK 20m
    • 15:50 16:30
      Extended support 40m
      - The helpdesk interface - Support teams - How support teams operate
    • 16:30 16:45
      Summary & conclusions 15m
  • Friday, 13 March
    • 09:30 09:40
      Review of regionalisation discussion and actions 10m
    • 09:40 09:55
      DTEAM meetings 15m
      - Structure of meetings - Team membership - Areas to improve
    • 09:55 10:10
      Deployment/operations problems & issues 15m
      - Look back at quarterly reports - Current known areas to be followed up
    • 10:10 10:30
      Communications 20m
      - Use of blogs - Current status of webpages - Areas of concern - Ticketing and metrics
    • 10:30 10:50
      TEA/COFFEE BREAK 20m
    • 10:50 12:00
      GridPP services 1h 10m
      - What we provide
      - Issues with the availability/use
      - WMS
      - tBDII
      - User requests
      - UI coverage
      - Monitoring tools
      - Testing new releases (deployment strategy with less PPS interaction)
      - Service resilience issues
      - Disaster planning
    • 12:00 13:00
      Current concerns 1h
      - SL5 - CREAM - Experiment software areas - Memory caps - Job efficiencies - Procedures for ... removing SEs...
    • 13:00 13:45
      LUNCH 45m
    • 13:45 14:30
      Experiment and user needs 45m
      - Site interactions with users - Review of problems from last 12 months - Areas yet to be resolved
    • 14:30 15:00
      Storage 30m
    • 15:00 15:20
      TEA/COFFEE BREAK 20m
    • 15:20 16:00
      Site tuning & improvement 40m
      - Lessons from experiment testing
      - Ramping up to cope with analysis needs
      - Upgrades and recommendations
        -- Best/worst performing sites
    • 16:00 16:15
      Conclusions & actions 15m