Claudio Vuerli
1. Application context and scientific goals
An accurate measurement of the whole-sky emission at microwave frequencies, and in
particular of the Cosmic Microwave Background (CMB) anisotropies, can have crucial
implications for the whole astrophysical community, as it allows a number of
fundamental quantities that characterize our Universe, its origin and its evolution
to be determined.
The ESA Planck mission aims to map the microwave sky by performing at least two
complete sky surveys with an unprecedented combination of sky and frequency
coverage, accuracy, stability and sensitivity.
The satellite will be launched in 2007 carrying a payload composed of a number of
microwave and sub-millimetre detectors which are grouped into a high frequency
instrument (HFI) and a low frequency instrument (LFI) covering frequency channels
ranging from 30 up to 900 GHz.
The instruments are built by two international Consortia which are also in charge of
the related Data Processing Centres (DPCs). The LFI DPC is located in Trieste, the
HFI DPC is distributed between Paris and Cambridge. In both Consortia, participation
in the development of the data processing software to be included in the DPCs is
geographically distributed throughout the participating Institutions. The overall
Planck community is composed of over 400 scientists and engineers working in about
50 institutes spread over 15 countries, mainly in Europe but also including Canada
and the United States. The fraction of this community that may be involved in Grid
activities can be defined as the Planck Virtual Organisation (VO).
Throughout the Planck mission (Design, Development, Operations and
Post-operations), it is necessary to deal with information management. This
concerns a variety of activities across the whole project, ranging from instrument
information (technical characteristics, reports, configuration control documents,
drawings, public communications, etc.), to the proper organisation of the
processing tasks, to the analysis of the impact of specific technical choices on
the science. For this purpose, an Integrated Data and Information System (IDIS) is
being developed to allow proper intra-Consortium and inter-Consortia information
exchange.
Within the Planck community the term "simulation" refers to the production of data
resembling the output of the Planck instruments. There are two main purposes in
developing simulation activities:
- during ESA Phase A and instrument Phases A and B, simulations have been used to
help finalise the design of the Planck satellite’s payload (P/L) and instrument
hardware;
- on a longer time-scale (up to launch), simulated data will be used mainly to help
develop the software of the DPC data processing pipelines, by allowing the testing
of the algorithms needed to solve the critical reduction problems, and by evaluating
the impact of systematic effects on the scientific results of the mission, before
real data are obtained.
The output of the simulation activity is Time-Ordered Information (TOI), i.e. a set
of time series representing the measurements of the scientific detectors, or the
value of specific house-keeping parameters, in one of the Planck instruments. TOI
related to scientific measurements are often referred to as Time-Ordered Data (TOD).
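As an illustration of the TOI/TOD distinction described above, the following is a minimal Python sketch of how a set of time series might be represented. The class name, field names and example values are hypothetical and do not come from the Planck simulation software; the sketch only assumes that each series has one source, a list of sample times and a list of values.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TimeOrderedInformation:
    """One time series from a single source (hypothetical layout)."""
    source: str          # detector name or house-keeping parameter name
    is_scientific: bool  # True for detector measurements, i.e. TOD
    times: List[float]   # sample times, in increasing order
    values: List[float]  # one value per sample time

    def __post_init__(self) -> None:
        if len(self.times) != len(self.values):
            raise ValueError("times and values must have the same length")
        if any(b < a for a, b in zip(self.times, self.times[1:])):
            raise ValueError("samples must be time-ordered")


def select_tod(toi_set: List[TimeOrderedInformation]) -> List[TimeOrderedInformation]:
    """Return only the series that hold scientific measurements (TOD)."""
    return [toi for toi in toi_set if toi.is_scientific]


# Illustrative values only: one detector series and one house-keeping series.
toi_set = [
    TimeOrderedInformation("detector-01", True, [0.0, 0.1, 0.2], [1.5, 1.7, 1.6]),
    TimeOrderedInformation("hk-temperature", False, [0.0, 1.0], [20.1, 20.2]),
]
tod = select_tod(toi_set)
```

Here `tod` contains only the detector series, while the house-keeping series remains part of the wider TOI set.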
Common HFI-LFI tools have been built and integrated into a pipeline system that
produces simulated data structures. These tools can be decomposed into several
stages, including ingestion of astrophysical templates, mission simulator,
spacecraft (S/C) simulator, telescope simulator, and electronics and on-board
processing simulator. Other modules, such as the cooling system model, the
instrument simulators and the telemetry (TM) packaging simulator, are
instrument-dependent. It should be noted that the engine integrating all the tools
has to be flexible enough to produce data in the various forms and formats needed.
The Planck Consortia participate in this joint simulation effort to the best of
their scientific and instrumental knowledge, providing specific modules for the
simulation pipeline. The code that produces maps and time-ordered sequences from
simulated microwave skies is produced jointly and shared by both Consortia: data
simulated by HFI and LFI are therefore coherent and can be properly merged. An
additional LFI-specific code is applied to the output data of the common code
(timelines) to simulate on-board quantisation and packetisation, in order to produce
streams of LFI TM packets.
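The quantisation and packetisation step can be illustrated with a minimal Python sketch. This is not the LFI on-board algorithm, whose details are not given here: it assumes a plain uniform quantiser and fixed-length packets carrying a sequence counter, and the function names, the step size and the packet length are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def quantise(samples: Sequence[float], step: float, offset: float = 0.0) -> List[int]:
    """Uniform quantiser (assumed scheme): nearest integer multiple of `step`."""
    if step <= 0:
        raise ValueError("step must be positive")
    # floor(x + 0.5) rounds halves upwards, unlike Python's built-in round().
    return [math.floor((s - offset) / step + 0.5) for s in samples]


def packetise(codes: Sequence[int], samples_per_packet: int) -> List[Tuple[int, List[int]]]:
    """Split the code stream into (sequence_counter, payload) packets."""
    if samples_per_packet <= 0:
        raise ValueError("samples_per_packet must be positive")
    packets = []
    for seq, start in enumerate(range(0, len(codes), samples_per_packet)):
        packets.append((seq, list(codes[start:start + samples_per_packet])))
    return packets


# Illustrative timeline values only.
timeline = [0.0, 0.26, 0.5, -0.3]
codes = quantise(timeline, step=0.25)
packets = packetise(codes, samples_per_packet=3)
```

With these inputs the last packet is shorter than the others; a real telemetry format would also define headers and padding, which this sketch leaves out.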
The goal of this application is to port the whole simulation software of the Planck
mission to the EGEE Grid infrastructure.
2. The grid added-value
Planck simulations are computationally demanding and produce a huge amount of data.
A single research institute usually cannot afford such resources, in terms of either
computing power or data storage space. Our application therefore represents the
typical case where the federation of resources coming from different providers can
play a crucial role in tackling the shortage of resources within single
institutions. Planck simulations benefit greatly from this, as a remarkable number
of resources are available at the institutions collaborating in the Planck VO, so
they can be profitably invested to gain access to additional resources shared on the
Grid. The first simulation tests have been carried out on the INFN production Grid
in the framework of the GRID.IT project. A complete simulation for the Planck/LFI
instrument has been run both on a single dual-CPU workstation and on the Grid using
22 nodes, one for each detector of the LFI instrument. The gain obtained by using
the Grid was a factor of ~15.
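The figures quoted above (22 nodes, one per detector, and a gain of about 15) can be put in context with a short Python sketch. It assumes that the gain is a speed-up in run time relative to the workstation and that the per-detector jobs are independent. The detector and node labels are hypothetical placeholders, not real LFI or Grid identifiers.

```python
from typing import Dict, List


def parallel_efficiency(speedup: float, n_nodes: int) -> float:
    """Fraction of the ideal n_nodes-fold speed-up that was obtained."""
    if n_nodes <= 0:
        raise ValueError("n_nodes must be positive")
    return speedup / n_nodes


def assign_detectors(detectors: List[str], nodes: List[str]) -> Dict[str, str]:
    """Map each detector to its own node: one independent job per detector."""
    if len(nodes) < len(detectors):
        raise ValueError("need at least one node per detector")
    return dict(zip(detectors, nodes))


# Figures quoted in the text.
N_NODES = 22
GAIN = 15.0

# Hypothetical labels, used only to show the one-job-per-detector layout.
detectors = [f"detector-{i:02d}" for i in range(1, N_NODES + 1)]
nodes = [f"node-{i:02d}" for i in range(1, N_NODES + 1)]

plan = assign_detectors(detectors, nodes)
efficiency = parallel_efficiency(GAIN, N_NODES)
print(f"{len(plan)} jobs, efficiency about {efficiency:.2f}")
```

Under these assumptions the efficiency is 15/22, or roughly 0.68 of the ideal 22-fold gain. The comparison is only indicative, since the reference machine was a dual-CPU workstation and the characteristics of the Grid nodes are not stated.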
Another added value coming from the Grid is its authentication/authorization
mechanism. Planck code and data are not in the public domain: we need to protect
the software copyright, and the data are moreover the property of the Planck P.I.
mission. The setup of a Planck VO makes it possible to easily monitor and control
access to both software and data without having to develop dedicated tools, since
these are already available in the Grid. Last but not least, a federation of users
within a VO fosters scientific collaboration, an added value of key importance in
Planck given that the users who collaborate on the mission are spread all over
Europe and the United States.
3. Experiences and results achieved on EGEE
Due to some initial issues in the start-up process of the Planck VO, we have not so
far been able to fully exploit the large amount of potential resources available for
our application. The Planck VO has proved to be quite difficult to manage; the
start-up process, in particular, has been slowed down by some difficulties in the
interactions between the local Planck VO managers and the respective ROCs. To
overcome these issues and make the Planck VO fully operational in a short time,
on-site visits to Planck VO sites are foreseen, both to train local managers in
setting up and maintaining the Planck VO node and to train potential local users,
so as to foster the use of Grid technology for the needs of the Planck application.
4. Key issues for the promotion of the GRID technology
On the basis of our experience with the astrophysical community, a special effort is
required to spread Grid technology and make potential users fully aware of the
advantages of using it. User tutorials can be extremely helpful in achieving this
goal. The preparation of a suite of Grid-oriented tools, such as Grid portals and
Grid graphical user interfaces, is also of key importance: such tools let users
interact with the Grid in an easy and transparent way and hide some of the
complexities of the underlying technology.
