|
|
- This is the start of a biweekly meeting to review ops core tasks
- The intention is to run the meeting in EVO: http://evo.caltech.edu/evoGate/. Join the meeting in the GridPP Community area.
- The phone bridge number is +44 (0)161 306 6802 (CERN number +41 22 76 71400). The phone bridge ID is 5047622 with code: 4880.
Apologies: Daniela, Duncan, Mark M, Kashif, Pete
|
|
|
|
Wednesday, 4 April 2012
|
11:00
|
|
|
Documentation
(10')
|
Andrew/Steve
|
https://www.gridpp.ac.uk/wiki/Documentation
F2F
- States & updates
- Create schematic of our document life-cycle
- Checks for broken links (on web pages) - automate?
- Apply 80:20 rule. What documentation is useful? Standards/survey
- Highlight duplication
- Open dialogue with EGI documentation group
- Assign review task areas to members of core-ops group
- Tagging pages
- Templates
17th Feb
- schematic
- VOMS data key document
- Content not in twiki but will be
- Basic operations absent
- Workflow of job & how info system works
- Cheat sheet
13th March:
- Update mediawiki / link to relevant documentation for current version
4th April:
-> Is the documentation starting to improve!?
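A possible starting point for the "Checks for broken links (on web pages) - automate?" item is sketched below. It is a minimal illustration only, using the Python standard library; the function names (extract_links, is_reachable, find_broken_links) are hypothetical and nothing here reflects a tool the group has agreed on. It assumes a HEAD request is an acceptable test of whether a link is alive, which may not hold for every server.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import urllib.error
import urllib.request


class LinkCollector(HTMLParser):
    """Collect the href targets of all <a> tags on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html, base_url):
    """Return absolute http(s) links found in html, in page order."""
    collector = LinkCollector()
    collector.feed(html)
    absolute = [urljoin(base_url, href) for href in collector.links]
    # Skip mailto:, javascript: and other non-web schemes.
    return [url for url in absolute if url.startswith(("http://", "https://"))]


def is_reachable(url, timeout=10):
    """Return True if the URL answers a HEAD request without an HTTP error."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # HTTP errors, DNS failures, timeouts and malformed URLs all count as broken.
        return False


def find_broken_links(html, base_url, check=is_reachable):
    """Return the links on a page for which check(url) is False."""
    return [url for url in extract_links(html, base_url) if not check(url)]
```

The check function is passed in as a parameter so the link extraction can be exercised without network access; fetching the wiki pages themselves is left out of the sketch.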
|
|
|
11:10
|
|
|
Monitoring
(10')
|
David
|
https://www.gridpp.ac.uk/wiki/Monitoring
F2F
- Review monitoring tools (what is needed and what is useful)
- Provide feedback on WLCG dashboards
- Recommendations for internal monitoring (local Nagios vs Ganglia) and pull together best practices
- Better define monitoring and regional tools
- Devise plans for, and advise sites on, perfSONAR (LHCONE)
- GridMon resurrection and best use
- Site dashboards (à la Glasgow example)
Comments:
1) What is the state of GridMon?
2) Do most people understand LHCONE aims?
3) DRI purchases will need tuning
4) What feedback has been given on dashboards?
5) We need more guidance to sites on areas like http://dashb-siteview.cern.ch/templates/siteview/index.html.
17th Feb
- scope how best to do things (input from TEGs)
- Put up pages for local tools
- (internal monitoring wiki)
- Observation - too many pages in this area
- Place most useful information behind site status dashboard
- Issue with contradictory results (provide examples)
- Current monitoring often gives little information about what is wrong if a site is failing
- Is there a document listing the tests sites are expected to pass?
4th April
-> What are the main recommendations from the TEG?
|
|
|
11:20
|
|
|
Accounting
(5')
|
Alessandra/Rob
|
https://www.gridpp.ac.uk/wiki/Accounting
F2F:
- Tracking APEL issues
- Rules of publishing
- Recommendations on gLite cluster and configuration
- Cross-checking site publishing
- SL metrics vs HS06 APEL
- Keeping a record of accounting issues
.. see also the wiki page
Comments:
1) Storage accounting is a hot topic in EGI
17th Feb
- GridPP metrics need revision
- Capture reasons in wiki for others (e.g. simulation under fast/full is labelled with the same tag, but this has a large impact on simulation weights)
- SL6 tests & benchmarking
March
- Storage accounting for all VOs. Snapshot.
4th April
-> Data on metrics (input for joint PMB-ops discussion)
-> News on latest benchmarking results
|
|
|
11:25
|
|
|
Security
(10')
|
|
https://www.gridpp.ac.uk/wiki/Security
- Next SSC end March (pending pilot and tools test)
- Self audit and best practice
- glexec directions
- Policies review
- Contribution to security TEG
- Default s/w configurations (are they secure? MySQL, for example, needs further locking down)
- UIs best practice
- Review of open ports required
- Support incident follow-up
15th March - plan to hold UK NGI security team meeting
4th April
-> Has there been any UK NGI meeting or other progress to note?
|
|
|
11:35
|
|
|
Staged rollout
(5')
|
|
https://www.gridpp.ac.uk/wiki/Staged_rollout
24th February:
- Automated page (query BDII?) for releases at sites
- Additional column in current table
March
- Additional sites involved (SL6?)
UK status update - http://www.hep.ph.ic.ac.uk/~dbauer/grid/staged_rollout.html
4th April
-> From Daniela
"I've written the shell script from hell that pulls all CE/SE/WMS versions off the bdiis (I failed on the bdii itself for the more obvious reason that EMI bdiis do not broadcast their implementation version). I can get the glite versions of the WNs from the WNs themselves as long as the site supports dteam. I haven't tested it on the EMI WN yet.
I can make an updated page when I come back.
Having said this, in the process of writing this script I noticed the glite 3.1 bdiis we still have in the UK, it would help if all sites would move on at some point....."
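The script described in the quote is a shell script and is not shown here, so the following is not a reconstruction of it. It is only a rough Python sketch of the general idea, summarising service versions from BDII output, under these assumptions: the BDII has already been queried and its LDIF output saved as text, the services are published with the GLUE 1.3 attributes GlueServiceType, GlueServiceVersion and GlueServiceEndpoint, and base64-encoded values do not need decoding. The function names parse_ldif and service_versions are hypothetical.

```python
from collections import defaultdict


def parse_ldif(text):
    """Split LDIF text into entries, each a dict of attribute -> list of values."""
    entries = []
    current = defaultdict(list)
    last_key = None
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current entry.
            if current:
                entries.append(dict(current))
            current = defaultdict(list)
            last_key = None
        elif line.startswith("#"):
            # Comment line in the LDIF output.
            continue
        elif line.startswith(" ") and last_key is not None:
            # LDIF continuation line: append to the previous value.
            current[last_key][-1] += line[1:]
        elif ":" in line:
            key, _, value = line.partition(":")
            current[key].append(value.strip())
            last_key = key
    if current:
        entries.append(dict(current))
    return entries


def service_versions(entries):
    """Map (service type, version) -> sorted list of endpoints publishing it."""
    table = defaultdict(list)
    for entry in entries:
        service_type = entry.get("GlueServiceType", [None])[0]
        if service_type is None:
            # Not a service entry; ignore it.
            continue
        version = entry.get("GlueServiceVersion", ["unknown"])[0]
        endpoint = entry.get("GlueServiceEndpoint", ["unknown"])[0]
        table[(service_type, version)].append(endpoint)
    return {key: sorted(endpoints) for key, endpoints in table.items()}
```

Services that publish no version are grouped under "unknown", which would make gaps like the EMI BDII case visible in a summary page rather than silently dropped.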
|
|
|
11:40
|
|
|
Ticket follow-up
(5')
|
|
https://www.gridpp.ac.uk/wiki/Ticket_follow-up
4th April
-> Area seems stable. What other improvements can we make here?
|
|
|
11:45
|
|
|
Core services
(5')
|
|
https://www.gridpp.ac.uk/wiki/Core_Grid_services
- Overlap with VO submission tests framework
- Proxy renewal follow-up?
DR - https://perfsonar.usatlas.bnl.gov:8443/exda/?page=25&cloudName=UK
4th April
-> Perfsonar results on 3rd looked 'odd'.
-> Plans for removing 3.1 BDIIs?
|
|
|
11:50
|
|
|
Wider VOs
(10')
|
|
https://www.gridpp.ac.uk/wiki/Wider_VO_issues
- Setting up new VO
- VO requirements (efficiencies of jobs)
4th April
-> Feedback from EGI UF
-> Issue with VOMS updates for new certs
-> Tool for extracting VO configuration data
|
|
|
12:00
|
|
|
Regional tools
(5')
|
|
- Resolve ops membership
- Document backup strategy
- Catch-all VO tests
- DIRAC for 'others'
4th April
-> Feedback and updates from EGI UF last week
|
|
|
12:05
|
|
|
Interoperation
(5')
|
|
https://www.gridpp.ac.uk/wiki/Grid_interoperation
Surveys.
6th March:
- Mapper
- SARONGS
4th April
-> Survey feedback as expected
-> Plans for GDB style meeting for EGI VOs
|
|
|
12:10
|
|
|
AOB
(5')
|
|
- Check talks at http://www.gridpp.ac.uk/gridpp28/programme.html
[Perhaps Max 5 slides: 2 slides background, 2 slides plans, 1 slide discussion topics]
14:00-14:10 Introduction - Jeremy Coles
14:10-14:20 Regional Tools - Kashif Mohammad
14:20-14:30 Staged Rollout - Daniela Bauer
14:30-14:40 Ticket Follow-up - Matt Doidge
14:40-14:50 Website - Andrew McNab
14:50-15:00 Documentation - Stephen Jones
15:00-15:10 Monitoring - David Crooks
15:10-15:20 Core Services - Mark Mitchell <<-
15:20-15:30 Grid Interoperation & ROD - Stuart Purdie
Switch core services for accounting?
Also on Wednesday
10:15-10:30 Other VOs - Chris Walker
11:00-11:30 Storage Directions and Issues - Wahid Bhimji
11:30-11:50 Networking Issues - Stuart Purdie
11:50-12:10 End-to-end Performance Tuning - Brian Davies
12:10-12:45 Many-Core Performance and Operations - Andrew Washbrook
15:00-15:30 Security-The BIG picture - Mingchao Ma
|
|
|