Core-ops tasks

Europe/London
EVO - GridPP Operations team meeting

Description
- This is a meeting for the review of the ops core tasks.
- The intention is to run the meeting in EVO: http://evo.caltech.edu/evoGate/. Join the meeting in the Janet(UK) Community area.
- Direct EVO link: http://evo.caltech.edu/evoNext/koala.jnlp?meeting=MeMMMu2I2BDDDI929sDv9e
- The phone bridge number is +44 131 474 4520 (CERN number +41 22 76 71400). The phone bridge ID is 646 9755 with code: 4880.

Apologies: Mark M
    • 11:00 11:10
      Documentation 10m
      https://www.gridpp.ac.uk/wiki/Documentation
      Done in Q1:
      - Updates on 'stale' pages
      - Process to remove pages
      - Document cull
      In progress:
      - VO admin guide to VOMS
      - Keydocs review
      - Updates according to https://www.gridpp.ac.uk/php/KeyDocs.php
      What's next?
      1) Go over the use cases for certificate renewal, including Browser Based and CertWizard. Document findings.
      2) Investigate status of existing documents on that topic, and harmonise.
    • 11:10 11:15
      Monitoring 5m
      https://www.gridpp.ac.uk/wiki/Monitoring
      Done in Q1:
      - Packaged monitoring?
      - Write up of monitoring scripts?
      In progress:
      - Web page for rankings
      - Understanding what's important to community
      What's next?
      - Investigate more active alert options
      - Write up and share scripts (e.g. related to temperature - monitoring differences between nodes/motherboards...)
      - Branch out to configuration management
      - WLCG developments with meta-monitoring
      - Overlap with cloud implementations
    • 11:15 11:20
      Staged rollout 5m
      https://www.gridpp.ac.uk/wiki/Staged_rollout
      Done in Q1:
      - Regular updates
      - EMI-1 to EMI-2 transition
      - First look at EMI-3
      In progress:
      - SL6 WNs
      - EMI-3 testing: https://www.gridpp.ac.uk/wiki/Staged_rollout_emi3? (are the current contributions sufficient?)
      What's next?
      - Getting active sites to report (such as on DPM)
      Concerns:
    • 11:20 11:25
      Core services 5m
      https://www.gridpp.ac.uk/wiki/Core_Grid_services
      Done in Q1:
      - Mesh testing
      In progress:
      - UKCMS cloud is being set up
      - Rates analysis
      - Glasgow testbed
      What's next?
      - IPv6 testbed (inform and coordinate)
      - Record/understand/disseminate network issues found (e.g. RAL vs distant sites)
      - Setting up 'virtual site' for core service monitoring (the benefit?)
    • 11:25 11:30
      Wider VOs 5m
      https://www.gridpp.ac.uk/wiki/Wider_VO_issues
      Done in Q1:
      - Other VO using Tier-1 CVMFS
      - NGS VOMS server transition (lots of effort required)
      - Enablement of EarthSci.
      - VO monitoring: has made a significant difference to T2k support. A little prodding here and there has hugely increased the number of CEs that work. This was presented at GridPP30.
      - WebDAV demo: presented at GridPP30
      In progress:
      - Webpage summary of git vs other methods for s/w updates (VO reference doc)
      - Ganga integration for smaller VOs (done for some)
      - WebDAV? https://www.gridpp.ac.uk/wiki/WebDAV#Federated_storage_support
      - DIRAC server (Janucz is making progress on this)
      - Catalogue sync (waiting on Janucz)
      What's next?/Issues:
      - No 'quick start' documentation
      - Pushing VOs to SRM. What else can we do?
      - WebDAV? https://www.gridpp.ac.uk/wiki/WebDAV#Federated_storage_support
      - EMI changes: UI (format changes), future of WMS and LFC
      - Encouraging wider engagement (impact)
      - Understand needs vs resources available?
    • 11:30 11:35
      Regional tools 5m
      Done in Q1:
      - Backup VOMS configured
      - Nagios alerts enabled for ??
      In progress:
      - DIRAC/Ganga for smaller VOs
      What's next?/Issues:
      - Better approach to changing active Nagios
    • 11:35 11:40
      Interoperation 5m
      https://www.gridpp.ac.uk/wiki/Grid_interoperation
      Done in Q1:
      - Continued participation in WLCG Operations Coordination Team (still Ian, Alessandra, Jeremy). Alessandra has an expanded role (chair + TF leader).
      - DPM community: community now active
      In progress:
      - Documenting experiences from SARonNGS
      - Other areas of interest from NGS, such as the CertWizard (what it is + how to use it)
      What's next?/Issues:
      - DC cloud work -> Push for the wider meeting?
      - Post-EMI engagement
    • 11:40 11:45
      Security 5m
      https://www.gridpp.ac.uk/wiki/Security
      Done in Q1:
      * The NGI Security team/GridPP ops security task team has continued a backup rota
      * Dealing with obsoleted and unsupported gLite middleware
      * SSC6
      In progress:
      * On-duty work
      What's next?/Issues:
      * Contributing to EGI security duties (we currently don't)
      * Arranging a UK run of SSC6 at remaining sites
    • 11:45 11:50
      Discussion/Other Areas 5m
      - Push on site data publishing (http://gstat2.grid.sinica.edu.tw/gstat/summary/EGI_NGI/NGI_UK/) and glue2 validator
      - Further contributions to the WLCG Operations Coordination Team and checking areas are addressed: https://twiki.cern.ch/twiki/bin/view/LCG/WLCGOpsMinutes130124
      - glexec and ARGUS enablement at sites
      - WN tarball future and example (backup to Matt needed?)
      - Integration of testbed resources
    • 11:50 11:55
      Accounting 5m
      https://www.gridpp.ac.uk/wiki/Accounting
      Done in Q1:
      In progress:
      - Understanding updates to storage algorithm
      - Changing algorithm to accommodate LHCb request(s) for more T2 disk resources
      What's next?
      - Review HS06 figures at each site
      - Broaden area to 'New technologies and impacts'? (e.g. use of whole node or cloud scheduling, impacts of many-core...)
    • 11:55 12:00
      Ticket follow-up 5m
      https://www.gridpp.ac.uk/wiki/Ticket_follow-up
      Done in Q1:
      - Continued weekly reporting procedure
      - Checking solved cases and reporting back
      - Highlighting "of interest" tickets
      In progress:
      - Weekly reporting
      What's next?/Issues:
      - No easy way to list all tickets submitted by UK NGI members.
      - The work is top-loaded at the start of the week, often leaving the tickets neglected for the latter half if Lancaster is suffering interesting times.
      - The ticket section only includes GGUS and directly linked Savannah tickets at the moment; there might be a need to cover Savannah more thoroughly. (Savannah use is ending.)
    • 12:00 12:01
      AOB 1m
      - Tentative next meeting date 20th June.