DRAFT Minutes of the LCG SC2 meeting, 25/11/05

Present: Jean-Jacques Blaising, German Cancio (secretary), Tony Doyle (via VRVS), Matthias Kasemann (chair), Marcel Kunze, Eric Lançon, Gerhard Raven (via VRVS), Les Robertson
Apologies: Wisla Carena, Jim Shank, Albert de Roeck

Minutes of the LCG SC2 meeting, 25/11/05

Organisational matters

News from the MB

LCG Status Report Review (Q3 2005)

Middleware Area

Applications Area

Fabric Area

Grid Deployment Area

General comments:

AOB

Organisational matters

  • The previous minutes (link) were approved.
  • Next SC2 meeting (Fabric Area focus meeting): Since the LCG project, including the Fabric Area, has recently been subject to a comprehensive review by the LHCC, it is decided to cancel the next SC2 meeting, scheduled for Friday, December 16.

News from the MB

  • Les reports from the Management Board held on November 22.
  • The feedback from the Comprehensive Review was discussed. The subjects included (see here for details): Lack of quantitative data for LCG; CASTOR2 issues; 3D project status.
  • VO boxes: A meeting with the experiments will be organized in the second half of January, in order to clarify issues related to the VO boxes.
  • In response to a question by Matthias, Les confirms that the evolutionary approach chosen for migrating from LCG-2 to gLite is considered a success. All required EGEE components are now part of the LCG software distribution, with the exception of the Workload Management System, which will be integrated in the near future. The LCG software distribution will be renamed to gLite, and will contain components from different sources such as EGEE, VDT and LCG.
  • The new ETICS project led by Alberto di Meglio is aiming for a standard packaging and distribution system. Its relationship with existing solutions like Yaim needs to be further investigated.
  • Service Challenges: The failed SC3 throughput phase will be repeated in January. For this exercise, a new version of dCache will be used. The service phase is progressing reasonably well.
  • CASTOR2: A Readiness Review is planned for March 2006. No major problems have appeared in the last 3-4 weeks. Matthias asks about the status of the "small file" problem. The current approach followed by the experiments is to concatenate small files prior to writing them to CASTOR.
  • 3D project: A plan exists for deploying 3D. While services will already be available in March, production status will be reached by September. The setup will vary by experiment: CMS is looking at FroNtier and Squid, whereas ATLAS/LHCb will use Oracle Streams. Tier-1 sites might have to support both solutions.
  • It has been recommended to disband ARDA and to replace it with a more general support area for experiments.
  • The relationship between EGEE and OSG has improved significantly. Not only is OSG well represented at the LCG MB, but there are also efficient technical contacts between the two projects. Via its participating sites, OSG will report indirectly to LCG. Also, the NorduGrid Facility will interface to LCG using gateways, which is being discussed in the GGF. NorduGrid participated in the SC3 throughput phase with good success.
  • The MB Working Group, which addressed reporting, monitoring and internal reviewing in LCG, has concluded its work.
    • A tabular, more detailed reporting format will now be set up.
    • Milestones for regional centers will be included. Experiments will report through their respective Task Forces.
    • The reporting will be monthly; a summary will be produced on a quarterly basis.
    • There will be a series of ad-hoc internal reviews with external participation; the first review will target CASTOR in March.
    • The Applications Area will continue with regular internal reviews.
    • There will be 1-4 annual service readiness reviews; the first one will take place prior to the startup of SC4.
  • The first MoU is likely to be signed by Academia Sinica (Taiwan) on December 6.

LCG Status Report Review (Q3 2005)

Middleware Area

  • Tony reports on his e-mail interaction with Frederic and Massimo.
  • AliEn: Tony mentions that AliEn has been adopted by the PANDA experiment (GSI). Matthias emphasizes that there is no maintenance or support agreement with CERN or LCG.
  • ARDA:
    • With respect to the move of ARDA activities towards the umbrella of experiments, Les explains that ARDA is funded 50% by EGEE and 50% by CERN. On the one hand, the experiments request that the work done by this team should not be restricted to distributed analysis. On the other hand, a tighter integration with the work of the EIS team is desired. Thanks to a contribution by INFN, the size of the EIS team will be increased from four to eight FTEs.
    • In LCG phase I, two ARDA people were assigned per experiment. This model has been subject to criticism, because in some experiments it took a long time to converge on useful work. In order to improve efficiency, a new model has been agreed for phase II. A total of 12 FTEs (6 funded by CERN and 6 funded by EGEE) will be working on common, cross-experiment solutions. Examples of such solutions are GANGA, PROOF or "VO box" services. Massimo is now talking to the experiments; convergence needs to be reached prior to the deadline for the EGEE-2 Technical Annex.
  • Progress has been achieved on the Pre-Production Service.
  • Tony considers the Distributed Analysis area to be of the greatest concern. As an example, Tony mentions the User Interfaces. The current GUIs are very large, in particular when compared with the light-weight and modular packaging of the NorduGrid equivalents.
  • Les reports that Claudio Grandi (INFN) has been appointed as the new Middleware Area manager, succeeding Frederic Hemmer.

Applications Area

  • ROOT-SEAL merger: Gerhard highlights that the integration is going in the right direction. A support agreement has been reached; only a few SEAL libraries need to be maintained independently from ROOT.
  • Manpower in SPI: The manpower reduction results from the move of manpower exclusively dedicated to SPI to more shared roles. Gerhard considers it appropriate to have developers participate part-time in common infrastructure activities, since this encourages developers to use common tools.
  • Jean-Jacques highlights that at the last AA meeting, an agreement was reached to provide a ROOT-independent Linear Algebra module, which consists of the MathCore and the "small matrices" package, by December.

Fabric Area

        Missing milestones: Two milestones listed in the Q2 report (1.2.3.6 - "Sub-station commissioned" and 1.2.1.4.1 - "Definition of T0 building blocks") are not found in the Q3 report. Bernd clarified by e-mail that 1.2.3.6 was fulfilled on September 30, and that the date assigned for 1.2.1.4.1 in the Q2 report was wrong: its real due date was November 30 (link).

        Proposed new milestones: A milestone for the interface between CASTOR and the DAQ needs to be added; this is already taken into account in the new milestone draft. With regard to the requested milestone for the ALICE DCs, Bernd considers this to be a level-3 milestone, which is therefore not shown in the Quarterly Report.

        CASTOR-2 is of concern, because it gives the impression of not yet being fully production quality. Marcel and Wisla suggested breaking up the CASTOR milestones into smaller steps. This should include not only milestones on functionality but also on testing and stability in order to harden the product. Les clarifies that the goal is to get CASTOR-2 ready for production by involving the experiments; this will include the identification of contact persons in the experiments, the agreement on schedules and the definition of corresponding milestones. No further problems have been discovered for more than one month, but the experiments need to make heavier use of CASTOR-2. In particular, the experiment production activities still need to be migrated from CASTOR-1 to CASTOR-2. The problem-fixing rate is considered very good, but there are still performance issues to be looked at. The 750 MB/s challenge will be completed before Christmas and will use newly arrived IBM equipment that is being installed and configured. As a consequence of an IT-internal reorganization, the CASTOR and ELFms development teams have been joined, and vacancies for CASTOR development are being prepared.

        Eric points out that there is a cost increase of up to 20% because of the increased memory requirements for experiment jobs. The maximum size has gone up from 1 GB (as planned for in the TDR) to 2 GB. Les reports that there are ongoing discussions on this subject in the GDB.

Grid Deployment Area

        Eric expresses concern about the SC3 set-up milestone (1.4.7.2), because the full throughput goals were reportedly not met. He wonders whether a new milestone needs to be added for this.

        Reporting from OSG: Currently there is no reporting channel, but Ian clarified that this has been foreseen for phase II.

        OSG-EGEE interoperability: The submission from EGEE to OSG has been tested, but the required reverse submission from OSG to EGEE is still outstanding. Les explains that the schedule for interoperability milestones was discussed at a joint OSG-EGEE workshop in September.

        3D service: The plans for the 3D service (see also News from MB section) were discussed at the last GDB meeting. The experiments still need to define their plans on how and when to use 3D; discussions for SC4 planning will take place at the Mumbai meeting in February. One site has expressed objections to running both FroNtier/Squid and Oracle Streams.

General comments:

  • The renaming of the LCG middleware distribution to "gLite" has created some confusion, in particular with regard to modules that are part of the original gLite package but not part of LCG-2 (like FiReMan). It is felt that this name change has not been announced widely enough. Les reports that this decision was agreed at the EGEE Management Board meeting in Pisa and will be further discussed at the GDB meeting in January.
  • Points for the POB:
    • Applications Area: Most concerns from the last review have been addressed. The FLUKA situation has not yet been completely resolved.
    • Fabric Area: CASTOR remains a concern, even if good progress has been achieved.
    • The new project reporting structure will help to achieve a closer involvement of remote sites.

AOB

  • The work of the LCG SC2 committee ends with this meeting. Matthias expresses his thanks to the present and past SC2 members for their hard work and dedication.