Brief notes from CCRC08 Storage Solutions Working Group of 11 Feb 2008
Agenda with attached material at:
http://indico.cern.ch/conferenceDisplay.py?confId=28791
Present: J.Shiers (chair), H.Renshall (notes), F.Donno, J-P.Baud,
M.Branco, A.Pace
Phone: P.Fuhrmann, G.Oleynik, M.Ernst, D.Petravick, T.Perelmutov
Addendum to WLCG SRM v2.2 MoU:
JS explained his intention that this be a lightweight addendum detailing
what has to be changed, based on operational experience; the proposal
would then be checked by the technical people, and implementation of the
agreed changes would start.
GO pointed out that Mat Crawford had suggested a written report on how the
eventual proposals in the MoU addendum are arrived at; JS agreed with this,
especially given the recent budget issues at FNAL.
SRM v2.2 key concepts, methods and behaviour:
FD presented her slides (attached to the agenda). The idea is to
establish short-, medium- and long-term goals with associated dates.
The immediate short-term goal is to select tape sets by means of tokens
or directory paths. She thought dCache was in good shape for this after
patch level 5, which allows tokens to be passed to the migrator. For CASTOR
she did not know the status. PF said that, starting now, he could not
say what is short- or long-term and would prefer a list of goals which
could then be sorted and prioritised. GO agreed, adding that
FNAL budget problems made any other approach difficult. PF asked, on slide
4, what the issue was with srmGetSpaceMetaData. FD replied that
it was important for srmGetSpaceMetaData to return correct sizes.
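For illustration, a minimal sketch of the kind of consistency this asks of an
srmGetSpaceMetaData reply. The size field names follow the SRM v2.2 spec; the
Python dataclass, the check function and the example token string are invented
here purely for illustration.

    # Sketch: sanity-check the sizes reported for one space token.
    # Field names (totalSize, guaranteedSize, unusedSize) come from the
    # SRM v2.2 srmGetSpaceMetaData reply; everything else is hypothetical.
    from dataclasses import dataclass

    @dataclass
    class SpaceMetaData:
        space_token: str      # token identifier returned by the SRM
        total_size: int       # bytes covered by the reservation
        guaranteed_size: int  # bytes guaranteed by the storage system
        unused_size: int      # bytes still free inside the reservation

    def sizes_are_consistent(md: SpaceMetaData) -> bool:
        """Correct replies should satisfy 0 <= unused <= guaranteed <= total."""
        return 0 <= md.unused_size <= md.guaranteed_size <= md.total_size

    # A reply claiming more free space than the reservation holds is wrong.
    bad = SpaceMetaData("EXAMPLE-TOKEN", total_size=10**12,
                        guaranteed_size=10**12, unused_size=2 * 10**12)
    assert not sizes_are_consistent(bad)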
GO then said he would like to understand the process of getting experiment
input on the extensions to the key concepts, methods and behaviour and
their priorities (slides 3 and 4 of Flavia's presentation).
JS reminded the group that the experiments are on the mailing
list of this series of meetings but said he was going to raise this
with the MB. PF said he had been hoping that more experiment technical
people would be at this and the next meeting. Since M.Branco was present it
was agreed to start with comments from ATLAS, which he (MB) agreed to have
ready in a week's time. JS thought we must bear in mind what can
realistically be delivered before May. PF said that for dCache he thought
the crucial things were security and honouring the space tokens given
in srmPrepareToGet and srmBringOnline. He would like to have the
experiment positions for the next meeting.
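For reference, a minimal sketch of the request shape this point concerns. In
SRM v2.2 both srmPrepareToGet and srmBringOnline can carry a targetSpaceToken,
and the request below shows where it would sit; only the field names come from
the spec, while the helper function, SURL and token value are hypothetical.

    # Sketch: an srmBringOnline request that names the space token the
    # recalled copy should land in.  Only sourceSURL and targetSpaceToken
    # are spec field names; the rest of this helper is hypothetical.
    def build_bring_online_request(surls, target_space_token):
        return {
            "arrayOfFileRequests": [{"sourceSURL": s} for s in surls],
            "targetSpaceToken": target_space_token,  # token to honour on recall
        }

    request = build_bring_online_request(
        ["srm://example.site/atlasdatatape/run1234/file.root"],
        target_space_token="TOKEN-FOR-REPROCESSING-SPACE")
    # send_srm_request("srmBringOnline", request)   # hypothetical transport call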
There was then a more technical discussion triggered by PF saying that
we must have a consistent interpretation of any extensions across the
different mass storage systems. TP said that instead of honouring space
tokens for a recall, dCache could use other means to characterise the pool
to be used for the recall. FD recalled that today dCache cannot
distinguish between generic and production users. MB said that
there will be restores of files that need to go to alternative pools, e.g.
one time for reprocessing, another for export, and this can only be done
by the restore accepting a space token. MB noted that he had sent
a short email outlining the ATLAS use case to the list of this meeting
(the subject was 'tokens on "more-than-write"'; for completeness I have
added the text after these minutes. HR). JS thought the only major long-term
issue is the behaviour of file restore from tape, and PF could
not predict how long changes to dCache in this area might take. MB said
that for ATLAS the most urgent goal is that of protecting spaces. FD said
that in her list of proposed short-term goals the first two (selecting tape
sets and making the space token orthogonal to the path) are low priority
or already done, while the last two (protecting spaces from generic users and
making implementations fully VOMS-aware) are more important to the
experiments and not yet done, the VOMS requirement being really a
part of the implementation of space protection.
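As an illustration of what protecting spaces from generic users through VOMS
awareness could mean in practice, a minimal sketch follows. The FQAN strings
use standard VOMS syntax, but the ACL table, the token descriptions and the
function are invented for illustration and do not describe any existing
implementation.

    # Sketch: allow a write into a space only if the caller's VOMS FQANs
    # match the roles allowed for that space.  The table below is invented.
    SPACE_WRITE_ACL = {
        "ATLASDATATAPE": ["/atlas/Role=production"],  # production writers only
        "ATLASUSERDISK": ["/atlas"],                  # any ATLAS member
    }

    def may_write(space_desc: str, caller_fqans: list[str]) -> bool:
        allowed_prefixes = SPACE_WRITE_ACL.get(space_desc, [])
        return any(fqan.startswith(prefix)
                   for prefix in allowed_prefixes
                   for fqan in caller_fqans)

    assert may_write("ATLASDATATAPE", ["/atlas/Role=production"])
    assert not may_write("ATLASDATATAPE", ["/atlas/Role=NULL/Capability=NULL"])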
In summary, JS said we now need replies from the experiments on the work
list and their priorities, and we want them in time for discussion in the
week of 25 Feb, by when we will have had a little more production experience.
It is possible we end up with more than one MoU addendum. He will send
a short email to the list of this meeting before the MB asking for
experiment feedback by 22nd Feb. We will decide the date of the next
meeting later.
========================================================================
email from M.Branco to the SSWG list on 11 Feb
Hi,
on the discussion of use cases for tokens on "more-than-write" that I mentioned during the meeting...
a small excerpt, but I believe it addresses the point Timur was referring to..
cheers
Our understanding of dCache LAN and WAN areas and staging
Data to be re-processed is initially written to the Tier-1 using a space token. The space token name is e.g. ATLASDATATAPE (T1D0). The relevant space tokens that involve staging are of course those of type T1D0. The space token itself is also appended to the physical path by ATLAS: srm://example.site/atlasdatatape/. It appears this has to be done for dCache sites - see below - but will also be done for CASTOR and DPM for the sake of consistency of naming conventions.
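A minimal sketch of the naming convention just described, assuming only the
example SURL quoted above; the function, host placeholder and dataset name are
invented for illustration.

    # Sketch: ATLAS appends the space token name (lower-cased) to the
    # physical path, so the SURL itself reveals which token the file
    # belongs to.  Host and dataset names are placeholders.
    def build_surl(host: str, space_token_desc: str, dataset: str, filename: str) -> str:
        token_dir = space_token_desc.lower()   # e.g. ATLASDATATAPE -> atlasdatatape
        return f"srm://{host}/{token_dir}/{dataset}/{filename}"

    print(build_surl("example.site", "ATLASDATATAPE", "run1234", "AOD.pool.root"))
    # -> srm://example.site/atlasdatatape/run1234/AOD.pool.root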
When reprocessing starts, ATLAS will request sets of files called datasets. These datasets will be staged automatically by dCache to a default pool, which should be of type LAN. ATLAS will specify the pool the files were initially written to, but this is ignored by dCache. CASTOR, however, will make use of the attribute, so ATLAS will always pass it. As there is the risk of staging files into the default pool at the same time, dCache sites themselves may make use of the path (e.g. /atlasdatatape/) to create different LAN pools where staged files go.
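A schematic model of that last point follows (plain Python, not actual dCache
configuration syntax): the directory component of the path steers staged files
into different LAN pool groups instead of one shared default pool. The
pool-group names are invented for illustration.

    # Sketch: map the leading path component of a recalled file to a LAN
    # stage pool group.  Names are illustrative only.
    PATH_TO_STAGE_POOLS = {
        "/atlasdatatape/": "atlas-reprocessing-lan-pools",
        "/atlasmctape/":   "atlas-mc-lan-pools",
    }
    DEFAULT_STAGE_POOLS = "shared-default-lan-pools"

    def stage_pool_group_for(path: str) -> str:
        for prefix, pool_group in PATH_TO_STAGE_POOLS.items():
            if path.startswith(prefix):
                return pool_group
        return DEFAULT_STAGE_POOLS

    assert stage_pool_group_for("/atlasdatatape/run1234/file.root") \
           == "atlas-reprocessing-lan-pools"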
The output of a reprocessing job must be stored at the same site, as well as be sent to other Tier-1s. The job will use the space token ATLASDATADISKTAPE (T1D1) to write the file to the correct area. As our space tokens are of type WAN, the other Tier-1s will be able to fetch the file. Future reprocessing jobs proceed as before (staging from what had been written to a WAN area back to the LAN default pool, optionally making use of the file path).
We assume jobs at the site should be able to read data written to the WAN area directly. That is, a file written to a space token of type T1D0 should be copied over directly to the WN if it is still resident on the disk buffer of the tape area.
While the workflow above fulfills our reprocessing requirements, it may cause problems for transfers between Tier-1s. It is always possible that a dataset is required by another Tier-1 and is currently residing on tape. Therefore, using only SRM commands, how can dCache be configured so that, in this case, files are not staged to the default pool (which is LAN and not accessible from the outside) but instead to a WAN accessible area? Manual configurations or limiting the deployment model of our data management system (e.g. specific machines with sets of IPs making requests) have consequences in our system, as our data management infrastructure is designed to serve many sites from the same machines in parallel.
For FDR1 we survive by accepting an extra disk copy from a LAN to a WAN disk, but for performance reasons this is not preferred as the final solution.
--
Miguel Branco, CERN - ATLAS Computing Group