DATAGRID CONFERENCE @ BARCELONA
Minutes of the WP4 day, 13/05/2003
Attendance:
Andrea Chierici (INFN), Enrico Ferro (INFN), Michele Michelotto (INFN),
Jan van Eldik (CERN), Piotr Poznanski (CERN), Sylvain Chapeland (CERN),
German Cancio (CERN), Maite Barroso (CERN),
Lord Hess (KIP), Volker Lindenstruth (KIP),
Gilbert Grosdidier (LCG, Test Group),
David Groep (NIKHEF), Martijn Steenbakkers (NIKHEF), Gerben Venekamp (NIKHEF),
Alexander Holt (University of Edinburgh),
Ariel Garcia (FZK - Forschungszentrum Karlsruhe, Crossgrid),
Cal Loomis (LAL-Orsay, WP6),
Rafael Garcia Leiva (Universidad Autonoma Madrid),
Andrew Washbrook (University of Liverpool)
General information from the 1st day presentations:
- Theme for the conference: EDG 2.0, release process and support, plan to completion and beyond
- points for final release:
- scalability & stability
- gcc 3.2.2
- MPI support
- VOMS and security (LCMAPS, ACLs…)
- RH8/9 UI and WN
- VDT update
- New functionality needed to adhere to technical annex
- Functionality freeze end of September
Configuration Management Task Report (Piotr Poznanski)
- Components of the configuration management framework were presented.
- Discussion on configuration for testbed 3.0; conclusion: it is compulsory to write LCFGng components, because LCFGng will be maintained in the testbed until the end of the project. Configuration with Pan is left for October (or after integration, whenever it happens), but is also compulsory for all WP4 components.
- LCG? They do not mandate any tool for fabric management; the choice is left to each site. LCFG light will be proposed for LCG-1 non-WN nodes; manual installation will also be available. For WNs at CERN, new installation and configuration tools are being prototyped at the moment, and there are plans to use them. The way to get WP4 tools widely accepted and used: dissemination.
Fault Tolerance Task Report (Lord Hess)
- FT global server not finished yet; it is to be used when a local node cannot repair itself, in which case the problem is escalated to the global server.
- Idea behind the burn-in tests? To test the reliability of a node. When you put a node into a cluster, you check whether the disk is available, what the data rate is, whether all memory is accessible…, stressing every part of the hardware and verifying basic operation of the machine. These tests are started manually and should run exclusively on a node, with no other tasks running in parallel.
- Integration? September
Gridification Task Report (Martijn Steenbakkers)
- MPI: FabNat is used for this; is it still needed? MPI inter or intra fabrics? It was originally requested by WP10. Crossgrid has a requirement for Intragrid MPI. In EDG, WP1 will only provide MPI inside a fabric. Action on Piotr and Maite to check with the ATF and the WP managers so that a decision can be taken. Maite checked with Bob: only MPI inside the fabric is needed, so there is no need for MPI across fabrics. Lowest priority.
- FLIDS: the requirement originally came from a use case in the WP4 architecture document. There are alternative ways of generating key pairs in hostile environments (at CERN they do it at installation time). Lowest priority; we think it is no longer needed.
Installation Task Report (German Cancio)
- No questions/comments/issues.
Monitoring Task Report (Jan van Eldik)
- No questions/comments/issues.
Resource Management Task Report (Maite Barroso)
- What are the "advance scheduling" features mentioned for the RMS?
- Plan to provide advance reservations?
- Timeline for LSF and Condor support?
- David Groep and Andrea Chierici volunteered to switch on the RMS and test it at their sites (NIKHEF development testbed and CNAF) once EDG 2.0 is deployed.
- Many requests to provide accounting as a separate module; today it is part of the RMS. Estimated effort?
- Maite will follow up all these points with Thomas.
ATF Report (Piotr Poznanski)
Integration Report (Sylvain Chapeland)
- All software for the last integration must be ready by 1st September.
- Final integration will take place during the month of September. If some components are ready earlier (e.g. LCMAPS), they can be integrated during the summer.
- Next release (2.1, testbed3) will be based on the gcc 3.2.2 compiler. Start porting the middleware, if needed. Autobuild is already set up to check whether it works; have a look at:
http://datagrid.in2p3.fr/autobuild//rh7.3-gcc3.2.2/
CERN deployment of WP4 tools (German Cancio and Jan van Eldik)
- Discussion on accessing monitoring data in the central repository from outside the farm (Gilbert Grosdidier). This could be possible via grid monitoring. Not all fabric monitoring information can be propagated to the grid level, mainly for security reasons.
ACTIONS (to be followed up at WP4 telephone conferences):
- Lord: store the latest version of the FT middleware in CVS and make it work with autobuild (gcc 3.2.2 compiler). Deadline: June.
- All tasks: start preparing migration to gcc 3.2.2.
- All tasks (driven by Piotr): package external interfaces separately according to the developers’ guidelines.
- David G. (NIKHEF) and Andrea C. (CNAF): switch on the RMS at their sites once the EDG 2.0 release is deployed, so that it can be tested in a real environment.