# Grid PP Deployment Board #

Day 2
Glasgow, 2005-06-02

http://agenda.cern.ch/fullAgenda.php?ida=a053292

## Present ##

David Smith, Dave Kelsey, Andrew McNab, Steve Fisher, Barney Garrett, Alessandra Forti, John Walsh, Robin Middleton, Jeremy Coles, Pete Gronbech, Dave Colling, Andrew Sansum, Stephen Pickles, Roger Jones, Tony Doyle, Pete Clarke, John Gordon, Jens Jensen, Fraser Speirs, Owen Maroney.

## Jeremy Coles: Deployment Issues and SC3 ##

- JC presented the results of the Bologna workshop (see presentation on agenda).

### gLite ###

- There was some discussion of LFC vs. Fireman. JG mentioned that the experiments choose. JC said that only LHCb had expressed an interest in Fireman.
- JC said that coexistence of LCG2 and gLite was something that we should try to avoid.
- There was a more detailed explanation of the possibility of migration to gLite, whether by coexistence or gradual replacement.
- DK asked how the decision between coexistence and gradual replacement was to be made. JC said he didn't yet know. JG said that a discussion was needed about whether T1s would do this migration. SC3 remains the priority.

### Accounting (DGAS) ###

- JG said that DGAS was a good solution for 'banking'. No higher-level integration of all data.
- RM asked if there were any plans for doing storage accounting. JG replied that GridICE may have some use for that. TD said that OMII might have a useful accounting component. SP said that accounting wasn't really separable from the rest of OMII and that the ideas might be more portable than the implementation.

### Site Functional Tests ###

- DS mentioned that the filtering of Freedom-of-Choice-approved sites in BDIIs is done at the ACL level, so VOs don't necessarily have to run their own BDII.

### JRA4 Networking Request ###

- PC said that networking experts need engagement from grid.
- PC said that JRA4 don't want to invent new tools.
  They want to standardise the interface under the GGF network monitoring standard (web services), and this is mostly done. Multiple backend monitoring tools work under this interface.
- Creating a new tool for visualising/diagnosing networks is not necessary. Ops centres will define where this will be plugged in.
- TD asked about cost-based network optimisation. PC said that nobody was working on that, but DC mentioned that some effort was ongoing in Network Reservation.
- PC suggested that EPCC people go to UK/I ROC and talk to them there. JG agreed with this.
- JG said that there was no existing tool in GridPP that this could be plugged into. PC said that netmon exists. TD said that Robin Tasker has a plan for deploying such a tool.
- AS said that there was an urgent need for network monitoring and diagnostics.

## Andrew Sansum: Tier 1/A ##

- See presentation in agenda.
- The human coordination problem in networking is very challenging.

## Stephen Pickles: NGS Update ##

- JJ described the recent CA Update.
- SF requested as much rapid feedback on gLite as possible from NGS work.
- NGS needs a strong story about migration of globus-based user workflow to gLite.
- VOMS and RB are being looked at for adoption from gLite.

## David Smith: LCG Plans ##

- Next release for SC3 sites ASAP. Normal schedule for other sites.
- Will include certified gLite components (CE+RB likely).
- Will probably not be called LCG 2.5.0.
- Local file catalogs are experiment dependent (one node per VO?). RJ said that a 1:1 ratio of boxes to VOs is too much.
- There was a discussion about VO agents. DS said that it was critical for SC3 that this was clarified.
- DS asked for a list of some things which he could clarify at CERN and some were provided.
  - gLite deployment
  - xrootd status
- JC asked for Freedom of Choice to show why sites failed.
- DK asked about OS support and the dependence of middleware on OSes. RJ mentioned that experiments are trying to address heterogeneity in application code.
- TD asked about the 64-bit roadmap.
- JC asked about library requirements for experiments' code.

## John Walsh: Grid Ireland ##

- Detail in presentation.
- JG asked why R-GMA isn't shared with the UK (would help with Ireland being seen as part of EGEE). JW said that RGMA is part of national .ie infrastructure.
- There was some discussion of Ireland's integration with the rest of the grid.

## Robin Middleton: Tier-2 MSN Support ##

- Integrated logbooks were suggested to be better than separate ones.
- VO/Sec through AMcN.
- Quarterly reports shouldn't be too heavy unless developers can/will make use of the feedback they provide.
- Traffic statistics and requirements document is for UKERNA audience.
- PC said that the person responsible for these quarterly reports is on long-term sick leave.
- Barney has been re-purposed to do the critical stuff, but there's no extra work to do additional logbooks.
- DC asked how we deal with experiment requests. RM said they have to be justified under the headings in "WMS" slide. DK said they have to be written into the requirements and agreed. RM said that additions can be made, but if it becomes an enterprise in its own right, it has to be agreed again. JG said that changes should be negotiated.
- JG said that there needs to be a contact point/person for the UK community.

## Steve Fisher: Information and Monitoring ##

- No big incompatible changes as RGMA evolves.
- YAIM installation support in addition to gLite deployment modules.
- JG asked about the registry-crashing bugfix. SF said that it's fixed.
- JG: 'work towards an installation that works out the box'.
- PC asked about other Edinburgh-specific issues.

## Andrew McNab: Security Monitoring Boxes ##

- OS patching 90% of solution.
- Vulnerable to repeat attacks across the grid.
- RM asked who 'owns' this. AMcN said that there's a hierarchy for automated triggering of alerts and that it does transcend the admin's boundary.
- SP asked about privacy.
  AMcN said that a site could run a box for their own purposes without publishing anything outside.
- DK: GridPP deployment plans still to be discussed.
- JW suggested using SELinux policies. AMcN said that read/write is separated between Apache and Syslog respectively.
- JJ asked about things that didn't use syslog (dCache!). Considered "Phase 2".

## Dave Colling: Workload Management Activity ##

- (No presentation in agenda)
- GridEngine port ongoing.
- gLite Workload Management System testing ongoing. Now part of official testing activity. 2 clusters for testing.
- Looking at extensions of gLite WMS for complex workflows.
- Workflow manager on top of workload manager. Architecture document exists.
- Agreement service can be used as a reservation system.

## Dave Kelsey: Security Operations ##

- SF and SP asked about logging/accounting and the Data Protection Act. BG pointed out that this is "acceptable use" by the user, not "terms and conditions of use" of the service.
- RJ asked who would give authority to this. DK said that he expected the collaboration board of LHC experiments would do so. Smaller VOs are a slightly looser case.
- PC pointed out that 'cutting someone off' opened the grid to legal challenge, and that AUPs should be enforced by legal entities with lawyers.
- AMcN pointed out that it's like anti-spam in that the sites only allow in users approved by VOs, not that they kick people out explicitly.
- AMcN raised the issue of being "jointly and severally" liable by forming an "unincorporated organisation".

## Roger Jones: Application Issues ##

- Experiments really need access control and resource usage accounting at the level of sub-VO groups.
- DC asked about the status of FTS. Said to be out "this week". CMS are not using it for SC3.
- RJ said that current data management tools don't handle small files well. JC said this is a big problem for biomed users.
- There was a discussion of what these small files actually are.
- Data access patterns are being derived from computing models. Important for storage.
- SP asked about longevity of storage elements. RJ said that it had to be local responsibility to look after data. JG said that there were still open questions about how site admins can re-register data in experiment catalogs pointing at new SEs.

## Jeremy Coles: Service Challenge 3 ##

- There was some discussion about the CPU/Disk split between Glasgow and Edinburgh. JC said that it's not clear what LHCb are going to do at Edinburgh given they only have 3 CPUs.

## AOB ##

- JG asked for more feedback on his issues with documentation and security on RH73. DK acknowledged the issues.

## Thanks ##

- Thanks to Glasgow for hosting.
- Thanks to Owen and best wishes for the future.

--------------------

Fraser Speirs
Glasgow
3 Jun 2005