DATAGRID CONFERENCE @ BARCELONA

 

Minutes of the WP4 day, 13/05/2003

 

Attendance: Andrea Chierici (INFN), Enrico Ferro (INFN), Michele Michelotto (INFN), Jan van Eldik (CERN), Piotr Poznanski (CERN), Sylvain Chapeland (CERN), German Cancio (CERN), Maite Barroso (CERN), Lord Hess (KIP), Volker Lindenstruth (KIP), Gilbert Grosdidier (LCG, Test Group), David Groep (NIKHEF), Martijn Steenbakkers (NIKHEF), Gerben Venekamp (NIKHEF), Alexander Holt (University of Edinburgh), Ariel Garcia (FZK - Forschungszentrum Karlsruhe, Crossgrid), Cal Loomis (LAL-Orsay, WP6), Rafael Garcia Leiva (Universidad Autonoma Madrid), Andrew Washbrook (University of Liverpool)

 

General information from the 1st day presentations:

  • Theme for the conference: EDG 2.0, release process and support, plan to completion and beyond
  • points for final release:
    • scalability & stability
    • gcc 3.2.2
    • MPI support
    • VOMS and security (LCMAPS, ACLs…)
    • RH8/9 UI and WN
    • VDT update
    • New functionality needed to adhere to technical annex
  • Functionality freeze end of September

 

Configuration Management Task Report (Piotr Poznanski)

  • Components of Configuration mgt framework presented.
  • Discussion on configuration for testbed 3.0. Conclusion: it is compulsory to write LCFGng components, because LCFGng will be maintained in the testbed until the end of the project. Configuration with Pan is left for October (or after integration, whenever it happens), but is also compulsory for all WP4 components.
  • LCG? They do not impose any tool for fabric management; the choice is left to each site. LCFG light will be proposed for LCG-1 non-WN nodes, and manual installation will also be available. For WNs at CERN, new installation and configuration tools are being prototyped at the moment, with plans to use them. The way to get WP4 tools widely accepted and used is dissemination.

 

Fault Tolerance Task Report (Lord Hess)

  • FT global server not finished yet. It is to be used when a local node cannot repair itself: the problem is then escalated to the global server.
  • What is the idea behind the burn-in tests? To test the reliability of a node. When a node is put into a cluster, they check whether the disk is available, what the data rate is, whether all memory is accessible, and so on, stressing every part of the hardware and verifying the basic operation of the machine. These tests are started manually and should run exclusively on a node, with no other tasks running in parallel.
  • Integration? September
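
The burn-in checks described above (disk availability, data rate, memory accessibility) can be sketched roughly as follows. This is only an illustrative sketch in Python, not the WP4 fault tolerance code: the function names, the sizes, and the choice of a scratch file in the temporary directory are all assumptions made for the example.

```python
import os
import tempfile
import time


def disk_check(directory, size_mb=8):
    """Write a scratch file, read it back, return (ok, write rate in MB/s)."""
    block = os.urandom(1024 * 1024)  # 1 MB of random data, reused per block
    start = time.perf_counter()
    with tempfile.NamedTemporaryFile(dir=directory) as scratch:
        for _ in range(size_mb):
            scratch.write(block)
        scratch.flush()
        os.fsync(scratch.fileno())  # force the data out to the disk
        elapsed = time.perf_counter() - start
        scratch.seek(0)
        ok = all(scratch.read(len(block)) == block for _ in range(size_mb))
    rate = size_mb / elapsed if elapsed > 0 else float("inf")
    return ok, rate


def memory_check(size_mb=16):
    """Fill a buffer with a known pattern and verify it reads back intact."""
    size = size_mb * 1024 * 1024
    buf = bytearray(b"\xaa") * size
    return len(buf) == size and buf.count(0xAA) == size


def burn_in(directory=None, disk_mb=8, mem_mb=16):
    """Run each check once and collect the results (hypothetical layout)."""
    disk_ok, rate = disk_check(directory or tempfile.gettempdir(), disk_mb)
    return {
        "disk_ok": disk_ok,
        "disk_write_mb_per_s": rate,
        "memory_ok": memory_check(mem_mb),
    }


if __name__ == "__main__":
    print(burn_in())
```

A real burn-in run would use much larger sizes and repeat the checks for a long time, and, as noted in the discussion, would run alone on the node so that other tasks do not distort the measured data rate.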

 

Gridification Task Report (Martijn Steenbakkers)

  • MPI: FabNat is used for this; is it still needed? Is MPI required between fabrics or only inside a fabric? It was originally requested by WP10. Crossgrid has a requirement for Intragrid MPI. In EDG, WP1 will only provide MPI inside a fabric. Action on Piotr and Maite: check with the ATF and the WP managers so that a decision can be taken.

Maite checked with Bob: only MPI inside the fabric is needed, so there is no need for MPI between fabrics. Lowest priority.

  • FLIDS: the requirement originally came from a use case in the WP4 architecture document. There are alternative ways of generating key pairs in hostile environments (at CERN it is done at installation time). Lowest priority; we think it is no longer needed.

 

Installation Task Report (German Cancio)

  • No questions/comments/issues.

 

Monitoring Task Report (Jan van Eldik)

  • No questions/comments/issues.

 

Resource Management Task Report (Maite Barroso)

  • What are the "advance scheduling" features mentioned for the RMS?
  • Plan to provide advance reservations?
  • Timeline for LSF and Condor support?
  • David Groep and Andrea Chierici volunteered to switch on the RMS and test it at their sites (NIKHEF development testbed and CNAF) once EDG 2.0 is deployed.
  • Many requests to provide accounting as a separate module; today it is part of the RMS. Estimated effort?
  • Maite will follow up all these points with Thomas.

 

ATF Report (Piotr Poznanski)

 

Integration Report (Sylvain Chapeland)

  • All software for the final integration must be ready by 1st September.
  • Final integration will take place during the month of September. Components that are ready earlier (e.g. LCMAPS) can be integrated during the summer.
  • Next release (2.1, testbed3) will be based on the gcc 3.2.2 compiler. Start porting the middleware, if needed. Autobuild is already set up to check whether it builds; have a look at:

http://datagrid.in2p3.fr/autobuild//rh7.3-gcc3.2.2/

 

CERN deployment of WP4 tools (German Cancio and Jan van Eldik)

  • Discussion on accessing monitoring data in the central repository from outside the farm (Gilbert Grosdidier). This could be possible via grid monitoring. Not all fabric monitoring information can be propagated to the grid level, mainly for security reasons.

 

ACTIONS (to be followed up at WP4 telephone conferences):

  • Lord: store the latest version of the FT middleware in CVS and make it work with autobuild (gcc 3.2.2 compiler). Deadline: June.
  • All tasks: start preparing migration to gcc 3.2.2
  • All tasks (driven by Piotr): package external interfaces separately according to developers’ guidelines.
  • David G. (NIKHEF) and Andrea C. (CNAF): switch on the RMS at their sites once release EDG 2.0 is deployed, so that it can be tested in a real environment.