Present: Jean-Jacques Blaising, German Cancio (secretary), Wisla Carena, Matthias Kasemann (chair), Pere
Mato, Eric Lançon, Gerhard Raven, Les Robertson, Jim Shank (via VRVS)
Apologies: Marcel Kunze, Albert De Roeck
Absent: Frederico Ruggieri, Lothar Bauerdick, Tony Doyle
Applications Area (AA) Internal Review Closeout
Organisational matters
News from the PEB
Discussion points
AOB
The LCG SC2 meeting was combined with the closeout of the
LCG Applications Area Internal Review (link to agenda
page).
- Jean-Jacques recalls that the mandate of the AA internal review is:
a) to examine the progress made since the last review,
b) to examine the adoption of the recommendations and the preparation of
the work program for the second phase of the LCG project,
c) to examine the overall coherence of the software,
d) to identify real and potential problems and risks, and to make
recommendations on the evolution.
- The experiments are satisfied
with the progress in the AA. The proposed integration of the ROOT and SEAL
projects is particularly welcomed.
- Most of the recommendations
made during the last review have been implemented
or are part of the proposed plan, which is considered technically
reasonable.
- Jean-Jacques recommends that
the SC2, and in particular the assigned godfathers, play an active role by
closely checking the progress of the Application Area not only at the time
of the Quarterly Reports but also in between.
- He also proposes holding
regular extended AF meetings, which should include the experiments and grid
service providers, in order to establish better communication and working
relationships (this was discussed by
the SC2 at a later stage, see below).
- A written report of the AA
internal review will be available by mid-April.
- The size of the GENSER
distribution is considered too big; it is
recommended to consider more granular packaging and distribution options.
- Concerns were
expressed regarding the support level given to HEPMC, in particular
in the areas of persistency and translators.
- Another concern is the planned
decrease in manpower in the Physics Validation
area (from 2.3 to 0.8 FTE). It is suggested that LCG should try to add manpower if GEANT4 considers this to be a
CERN task.
- GEANT4, which has proven its
level of maturity, has become the main simulation engine for LHCb, ATLAS
and CMS. ALICE is encouraged to clarify its doubts
concerning hadronic physics.
- It is
recommended that SPI tools be used for the distribution of
FLUKA, as is already done for other AA software.
- Simulation framework:
- The experiments show no
interest in a common generic simulation framework. Should more
than one experiment express interest, VMC remains an option.
- Further development of GDML is
encouraged.
- GEANT4 Python interface: The
documentation should be improved. Also, an exchange of experiences with experiments that
are building similar solutions (like ATLAS) is suggested.
- Marco underlines the impressive
progress since the last review, with a widespread adoption of SPI tools by
experiments and projects. The build system is no longer an issue.
- Most recommendations have been implemented. In particular, there are already
visible benefits of having a central librarian in place.
- It has been
observed repeatedly that the tools developed for
SPI should be packaged for general use.
- The Doxygen/LXR
documentation should be produced automatically as
part of the release procedure; also, cross-referencing between projects
should be possible.
- Savannah:
- It is
recommended to set up a user forum, for example in the form of a
mailing list similar to root-talk.
- Tools for bulk submission and
retrieval would ease migration and preparation of reports and statistics.
- The proliferation of
additional systems for bug and task tracking (Bugzilla,
ROOT bug DB) is a concern. Experiments need a coherent system
which eases cross-referencing and migration of bugs between projects.
It is recommended to converge on Savannah and not to dedicate CERN/LCG
resources to the maintenance of alternative systems.
- The procedures for selecting
and defining the lifetime of packages, platforms and compilers should be documented. This includes documenting the
corresponding support commitments. Not only the AF but also other LCG
areas should be involved in the decision-making process.
- Build and distribution:
- Even though the choice of
build tools is no longer an issue, a clear statement of strategy is
required. This includes the clarification of the role of SCRAM.
- Package dependencies should be minimized by distinguishing between
build, test and runtime dependencies.
- The LIM (experiment
librarians) meeting should be used for
discussing and defining what different distributions are needed. The
needs of LCG deployment should be addressed as
well.
- The AA should
be represented in Linux and compiler certification discussions.
- QA:
- An impressive suite of tools
is now in place. Their adoption by the experiments is encouraged. Clear
QA procedures to be followed by projects should
be defined, as well as ways to encourage compliance.
- SPI should adopt a
coordinator’s role for the evaluation and selection of external tools,
which includes making recommendations to the AA and AF community. In
order to minimize duplications, this coordination role could
be extended to include non-QA tools like profilers or XML parsers.
- Even though training is not one
of SPI’s current responsibilities, it is felt
that it should become one. The very successful Python course should be continued.
- The progress since the last
review has been excellent. Most recommendations have
been implemented. POOL is deployed and
used in DCs by three experiments; around 400 TB of data have been stored.
- The impact of the merger of
SEAL and ROOT on POOL is of concern, since it will generate additional
workload that must be taken into account in the
planning.
- The documentation has been greatly improved, but the User’s Guide
still contains missing, wrong or obsolete, and garbled items.
- Despite a great effort in bug
fixing, a small number of persistent
bugs remain. The release process should be brought in line
with that of the rest of the AA.
- Error handling and reporting
need to be improved. In particular, errors must
be propagated to end users with clear indications of which
components failed.
- With regard to POOL
collections, a lack of clear requirements from the experiments was noted. CMS is using POOL implicit collections,
but other experiments may require new functionality in the future. In
order to anticipate the required effort, deadlines for the submission of
new user requirements should be proposed.
- File Catalogues: POOL will have
to work with different FC back-ends as selected by sites and VOs. These FCs should implement the POOL APIs. In order to distinguish between POOL and
FC back-end performance problems, a reference benchmark for the FCs should be defined.
- COOL: Experiments interested in
COOL are invited to commit more manpower in order
to ensure the survival of the project. So far, two experiments have
committed to using COOL; CMS is considering its usage.
- In order to comply with
security aspects in POOL, end-to-end solutions should be
taken into account; POOL should not be the weakest point in the
chain. The impact of security on performance is of concern. Precise user
requirements are needed in order to define an
appropriate solution. It is suggested to check
user requirements and solutions developed in the Grid community and other
applications like PROOF.
- Major progress has been made in the SEAL area. Since the last review,
there has been a widespread adoption by the experiments.
- In terms of project
organization, all experiments welcome the proposed merger of ROOT and SEAL
activities. The experiments should set the schedule and priorities in
their role as stakeholders. LCG manpower should
concentrate on high priority items (like dictionary, Mathlib), and the AF
should supervise the process.
- The merger should preserve the
best of both projects. The architectural strengths of SEAL, like its
component model, should be preserved. It is not
sufficient to limit it to “adding missing features to ROOT”.
- A light-weight
packaging with minimized dependencies is considered crucial. Applications
should be able to select core components without having to take the entire
framework.
- Basic classes and components
should be decoupled where appropriate (e.g. removing inheritance from TObject in ROOT-CORE). Also, owing to the differences in
the plug-in architectures of SEAL and ROOT, proposed changes will need to be weighed carefully against their impact on existing
experiment schemes.
- There is broad agreement on
a common dictionary; more detailed planning will be
defined in a workshop in May. The integration of Mathlib is the
most advanced.
- The proposed schedule for SEAL
and ROOT migration, which aims for common and duplication-free libraries
in January 2006, is supported.
The role of CLHEP and possibilities for its
replacement were discussed. CLHEP is an external
component and its evolution is not controlled by the AA.
However, CLHEP is being used by GEANT4 and the
experiments. The replacement costs need to be evaluated,
and possible migration strategies need to be defined.
Organisational matters
- The previous minutes (link)
were circulated and will be considered approved if no
comments are received by 6/4/05.
- Next meeting (June 3, Agenda page):
- The next meeting will
be focused on the review of the Q1/05 LCG Status Report, which is
due by the end of April. In order to prepare the meeting, all SC2 members
should review the quarterly status report and come up with concerns and
questions, especially in the assigned godparent section. Questions and
concerns should be sent to the SC2 list prior to the meeting (by
Wednesday May 18), so that the LCG project can prepare meaningful
answers by the time of the meeting.
- A phone conference is proposed for
Wednesday May 18, 17:00h.
- Pere requests that the Applications
Area, having already been subject to an internal review and an SC2 focus
meeting, be excluded from the next Status
Report review. The SC2 agrees with this proposal.
News from the PEB
- Service Challenges: An improved
plan for Service Challenges was presented (link)
and discussed. Not only requirements on capacity and throughput
have been defined, but also which of the major sites will be joining and
at what dates.
- Service Challenge 2 (SC2) is currently running. Its
goal is to perform sustained disk-to-disk (SRM) transfers to seven Tier-1
sites at an aggregate target rate of 500 MB/s for 10 days. Apart from
hardware problems at CERN, which have been dealt with, the SC is running
very smoothly and is reaching peaks of 700-800 MB/s. FermiLab and FZK are
providing large capacities. Compared to SC1, a significant improvement has been achieved in terms of networking and site
organization. Les points out that the WAN speed record in June last year
was around 6.4 Gb/s. This is very similar to the rate
achieved now, sending data from real file systems to real file
systems at a steady rate.
- SC3 will include disk-to-tape
transfer tests from CERN/T0 to T1 sites, as well as running experiment jobs. A number
of T2s will be involved as well. From September on, a service phase will start and the experiments will get involved by
carrying out tests in order to validate their computing models. SC3 will
represent a big increase in complexity over SC2.
- Replying to a question by
Matthias, Les reports that a draft of the LCG TDR is
scheduled for April 11 and will contain a general LCG section and
experiment-specific sections.
Discussion points
Should the SC2 interact more with the experiments and how?
- Even if some SC2 members belong
to experiments, the SC2 committee as such does not directly interact with
the experiments. The experiment computing coordinators were
invited only to the first SC2 meeting after the reorganization.
Matthias points out that the ALICE and LHCb sections are missing in the
LCG Q4/04 status report, which makes the follow-up difficult, in
particular for SC2 members not based at CERN. The SC2 would like
the experiment contributions to be included in the LCG status
reports. Based on this input, the SC2 godparents may contact the experiments
with questions and requests for clarification. Matthias will contact the
experiment coordinators in this regard.
How can the SC2 help the reorganized Application Area?
- Pere considers it helpful that
there was a review already at the beginning of his mandate as AA manager.
This review provides him with valuable input and recommendations that he
can now discuss with the experiments inside the Architect’s Forum.
According to Pere, the AF is the right body for such discussions, in
particular if a clear strategy has been put in
place. For items on which no agreement is reached
inside the AF, there is a defined escalation procedure to the PEB, where
the experiment coordinators are represented. However, this escalation is rarely needed and should be avoided. An incentive
for collaboration within the AF is that LCG resources are common to all
experiments and need to be shared. It was also pointed out that LCG-funded resources
should be devoted to activities related to LHC computing.
Is a forum needed for client-to-Grid service provider communication?
- Should the AF be extended
for dedicated meetings? In principle, ARDA was set up
with the idea of providing such a forum. Ways have to be
found to better integrate ARDA into the main activities of the
experiments. Alternatively, the successful Baseline Services Working Group
(BSWG) might be continued as an ad-hoc working
group. However, the BSWG was formed with a set of
topics defined in advance, so there might be a mismatch in the expertise
of the WG members as the subjects evolve.
- On the one hand, general-purpose
forums tend to grow too large, and their discussions may
become less focused and fail to lead to decisions. On the other
hand, there is a limited representation of expertise in restricted forums
like the AF or BSWG. Pere suggests creating a software development
activity targeted to physics analysis and solving concrete problems. The
possibility of restructuring ARDA for this purpose is
discussed. It is suggested that ARDA could
move onto more focused products and away from independent and
experiment-specific activities with emphasis on testing. However, any
change to the ARDA work plan would need prior discussion and agreement
with the experiments.
- PROOF: A Program of Work needs
to be defined for PROOF, including a
specification of the required environment for significant testing. Within
the AA, PROOF would be best placed outside the
ROOT/SEAL CORE activities.
Are there changes in the AA staffing estimates?
- Pere reports that the current manpower figures are essentially unchanged from Torre’s original planning. Pere will revisit the
planning, but he does not expect the overall number to change. In manpower terms, the LCG contribution should be
considered a core around which the experiments contribute. However, manpower contributions below a given threshold (e.g.
below 20%) should not be counted, since they are insufficient for
productive work.