pre-GDB on Grid Storage Service 11 November 2008
================================================

Minutes by Flavia Donno

Attendees: about 50

- Introduction and experiment requirements
------------------------------------------

In her presentation F. Donno introduced the event, listing its goals, and summarized the experiment requirements that are still not met:

- Control on access to spaces and on space operations
- Control on storage resources in general (tape access)
- Full ACLs on the namespace
- Selection of storage spaces (via tokens or directories associated to pools)
- Pinning/unpinning capabilities based on FQAN
- Space purging capabilities
- User/group quotas
- Optimized staging operations
- File access libraries (available for a variety of compilers)
- Statistics on storage usage

There was a discussion about the list itself. The experiments stressed that the requirements are not new and that nothing has been added beyond the original requests. The audience agreed that a breakdown of requirements per VO is needed (such a breakdown is indeed available). The developers stressed that a prioritization of the developments is absolutely needed in order to focus the available effort in the right direction. Furthermore, for the analysis scenarios it would be very useful to have a typical user application that developers can run in order to check whether the requirements are satisfied by the storage services.

Issue to be discussed at higher level: Provide developers with a prioritization of needed features/requirements.

- FTS: an update
----------------

A. Frohner gave an update on the status and plans of FTS. Among the issues raised by the storage developers, sites and experiments are:

- Issuing srmls -l requests

A discussion about the srmls -l operation and the performance of dCache PNFS followed. srmls -l is in general an expensive operation; it should be issued with care and only when necessary. In particular, srmls on big directories should be avoided, and application developers and users are discouraged from performing an srmls on directories whenever possible. An srmls on an explicit list of files is instead fine. FTS used to issue srmls -l in many cases: on the source, to find out whether the file exists and is online, and on the target, to find out whether the destination directory and the file itself exist. FTS has never issued srmls requests for the *contents* of a directory; it always uses numOfLevels = 0. A savannah bug had already been opened (https://savannah.cern.ch/bugs/?39992). The behaviour of FTS is fixed with FTS 2.2: only a simple srmls is used for existence checking. dCache has also been optimized to make the srmls operation as light as possible.

- Support for checksums

It is very important to detect file corruption as soon as possible, before replicas of the file are made (and the corruption spreads). One way to fulfil this requirement is through the support of a checksum. FTS should allow clients to specify a checksum type and value (https://savannah.cern.ch/bugs/?43825). The checksum can be specified by the client, retrieved from the source through an srmls, or retrieved through the CKSM command of the gridftp server. Please check savannah bug #43825 for more details. Storage services should also return checksum information through the SRM layer. (A small client-side verification sketch is given after the next item.)

- FTS logs and configuration parameters should be made available for debugging purposes

Two savannah bugs have been opened (https://savannah.cern.ch/bugs/?43819 and https://savannah.cern.ch/bugs/?43837).
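Illustrative only: the following minimal Python sketch (not FTS or GFAL code) shows how a client could compute the adler32 checksum of a downloaded file and compare it with a value obtained from the client, from an srmls of the source, or from the gridftp CKSM command, as discussed in the checksum item above. The file path and expected value are placeholders.

    import zlib

    def adler32_of_file(path, chunk_size=1024 * 1024):
        """Compute the adler32 checksum of a file, reading it in chunks."""
        value = 1  # adler32 starting value
        with open(path, "rb") as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                value = zlib.adler32(chunk, value)
        return "%08x" % (value & 0xffffffff)

    # 'expected' would come from the transfer request or from the source SE;
    # the value below is a placeholder, not a real checksum.
    expected = "018e50b6"
    local = adler32_of_file("/tmp/replica.root")
    if local != expected:
        print("checksum mismatch: %s != %s; the transfer should be retried" % (local, expected))

A check like this, run before further replicas are made, is what catches corruption early.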
- FTS should pick up blocked transfers, since those can clog the channel

The cloud feature of FTS should be investigated for this. The problem concerns both T2-to-T1 and T1-to-T1 transfers. FTS 2.1 also introduces a split between control and data transfer operations, which should alleviate the problem.

- Multiple FTS channels can stress an SRM

Also for this the cloud feature of FTS should be investigated. Furthermore, after the pre-GDB the FTS developers met the storage developers in order to come up with a common strategy that would allow FTS to back off in the case of a busy SRM server. A solution that can be adopted by all storage and client developers is on its way.

- In general there is a difficulty in configuring the channels correctly. Sites need advice. Experiments do not see their share of the channels actually being respected. How can usage be optimized?

- Sites also need advice on hardware and general configuration recommendations (CPU load, memory, etc.).

Issue to be discussed at higher level: Sites need coordination for hardware acquisition, configuration, management and operation of storage and transfer services.

- GFAL and lcg_util
-------------------

R. Mollon presented the status and plans for GFAL and lcg_util. The discussion covered the following points:

- There is no support for checksums in GFAL/lcg_util, and there are no plans for it.

- lcg_util/GFAL retries on the next SE (the first SE selected is within the same domain, then the next available one) if the TURL is not returned by the storage service. Sometimes storage services return a TURL even if the file is unavailable or lost; this should be improved. Also, some sites use several domains within the same site, which can be a problem for lcg_util/GFAL. The SE load should also be taken into account when retrying on a different SE.

- In general, there are no retries in lcg_util/GFAL. This can change. Advice is needed from the storage developers on the best strategy to follow in case of a retry.

A discussion started about files that are unavailable or lost. In the past, both dCache and CASTOR exposed a bug for which unavailable files were incorrectly reported as either NEARLINE (dCache) or LOST (CASTOR). This is now fixed. It was said that sites should provide experiments with a list of lost or unavailable (hardware failure) files so that the experiments can decide what to do with their catalogues. This is normally a manual and expensive operation. Furthermore, the SRM should correctly report lost or unavailable files so that an automatic recovery procedure can be put in place.

- DPM status and plans
----------------------

D. Smith presented the status and plans for DPM.

- DPM will provide NFS v4.1 support only if requested by users.
- The DPM developers are investigating quotas. There are discussions on how to handle specific issues: users going over quota, accounting, etc. Quota support in DPM will not come soon; there is no timescale.
- File access for local analysis will be granted at the moment through GFAL with rfio, gsidcap and root.
- The DPM developers will investigate Nagios-style alarms.
- Checksums are supported by the DPM gridftp server.

- StoRM status and plans
------------------------

L. Magnoni from CNAF presented the status and plans for StoRM. The new release (1.4) comes with a gridftp load balancer offering several policies. Furthermore, StoRM v1.4 enables access control on Storage Areas through a new, more efficient mechanism (a new authorization source).
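The balancing policies of the StoRM gridftp load balancer are not detailed in these minutes. Purely as an illustration of the idea, the sketch below shows two simple policies, round-robin and least-used, for choosing a gridftp door from a configured pool; the door names and data structures are hypothetical and do not reflect the actual StoRM implementation.

    import itertools

    # Hypothetical pool of gridftp doors; in a real service these would come
    # from the configuration, not be hard-coded.
    DOORS = ["gridftp01.example.org", "gridftp02.example.org", "gridftp03.example.org"]

    _round_robin = itertools.cycle(DOORS)
    _active_transfers = dict.fromkeys(DOORS, 0)  # door -> current transfer count

    def pick_round_robin():
        """Return the next door in strict rotation."""
        return next(_round_robin)

    def pick_least_used():
        """Return the door currently serving the fewest transfers."""
        return min(_active_transfers, key=_active_transfers.get)

    door = pick_least_used()
    _active_transfers[door] += 1
    print("sending transfer to", door)

A least-used policy needs feedback from the doors (or the balancer's own bookkeeping), which is the main practical difference from plain round-robin.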
M. David from LIP gave a presentation on their experience of migrating from dCache to StoRM (on Lustre). The experience was very positive.

Issue to be discussed at higher level: Sites need coordination and assistance in translating experiment requirements into storage-specific configurations.

- dCache status and plans
-------------------------

P. Fuhrmann presented the status and plans for dCache.

The discussion immediately focused on the slow performance of PNFS. ls operations on large directories clearly penalize performance; that is why FNAL and CMS now use directories with no more than 1000 entries (a sketch of such a directory-bucketing scheme is given below, after the site reports). srmRm used to be an expensive operation, since it forced an internal id-to-path translation; this is no longer the case. It is OK to list a set of files, even a large one. In general, sites are advised to switch to the fast PNFS and to Postgres 8.3. This version of PNFS is certainly more performant (twice as fast as the old one?), but it is not clear where the new threshold lies with the fast PNFS.

Sites should prepare to switch to Chimera. The development team will provide a tool that allows sites to switch to Chimera in less than a day. Chimera offers a long list of features:

- improved lock handling in the DB
- much more complex queries will be possible, since a good portion of the DB is exposed to users
- access to both data and metadata is allowed, which enables quotas
- default ACLs can be inherited

G. Behrman from NDGF gave an update on a few dCache components, such as the dCache_check module, the Postgres WAL for DB replication, the dCache migration and copy manager, and good practices for DB housekeeping; please check the slides. Timur from FNAL stressed that a dump and restore is faster and more effective than running a vacuum full on the DB.

A recommendation about memory usage on the disk pools was also given. From Gerd: "it is better to leave the memory to the OS to allow it to cache the data. The pool does however need some memory to maintain the meta data of the files stored on the pool. On pools with a large number of files, the default may be too low. In that case you should increase the -Xmx setting in dCacheSetup (don't touch the MaxDirectMemorySize). I should also add that the alternative meta data store using a Berkeley DB uses a lot less memory than the default one. We use the Berkeley DB store at NDGF, but I understand that some sites are reluctant to put this data in a database, as it may be hard to predict what happens if the DB becomes corrupted."

The migration and copy manager is unaware of space reservations and of the pin manager. It works between "new" 1.9.1 pools. Sites are strongly recommended to test their configuration after an upgrade, before deploying it in production. dCache has improved logging: all log file entries belonging to a particular request will be tagged with a unique ID.

R. Trompert and R. Tafirout gave presentations on the dCache setup at their sites. ATLAS was concerned about the staging performance at SARA (the staging process is not working at this site). The problem has been recognized as being due to an internal buffer, used for optimization reasons, that is too small. Furthermore, the developers think they can help SARA better tune their resources and configuration for staging. NDGF will focus on SARA. In particular, a quick fix will be made in dCache so that dCache can pick up staged files before the garbage collector acts on them again and removes them from disk.
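As referenced above in the PNFS discussion, one simple way to keep directories below roughly 1000 entries is to bucket files into hashed subdirectories. The sketch below is illustrative only: the path layout, bucket count and file names are hypothetical, and the code is not taken from any FNAL/CMS tool.

    import zlib
    import posixpath

    # Hypothetical number of hash buckets, chosen so that the expected number
    # of entries per subdirectory stays well below the ~1000-entry guideline.
    N_BUCKETS = 100

    def bucketed_path(base_dir, filename):
        """Map a file to base_dir/NN/filename, where NN is a stable hash bucket."""
        bucket = zlib.crc32(filename.encode()) % N_BUCKETS
        return posixpath.join(base_dir, "%02d" % bucket, filename)

    # The same file name always maps to the same subdirectory, so the layout
    # can be recomputed by any client without a lookup service.
    print(bucketed_path("/pnfs/example.org/data/cms/store", "reco_run012345_evt42.root"))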
It was stressed that it would be nice to encourage collaboration between Tier-1s so that they can help each other. Events like this one are very useful. The next dCache event focused on Tier-1s will be organized in January 2009. Developers can intervene to help Tier-1 sites whenever needed.

Issue to be discussed at higher level: A coordinated effort is needed in order to test Chimera.

Issue to be discussed at higher level: There is a need for a coordinated WLCG policy, for instance on how to act on or report problems (see the lost-file case: should a site admin automatically remove lost files from the name space? Should different actions be put in place for different VOs? This last approach is very difficult for site admins to handle).
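On the lost-file policy question above, one ingredient of any coordinated procedure would be a machine-readable report of lost or unavailable files that experiments can replay against their catalogues. The sketch below is purely hypothetical: the one-line-per-file report format ("<SURL> <STATUS>") and the suggested actions are assumptions for illustration, not an agreed WLCG convention.

    # Hypothetical lost-file report parser and action planner.
    def read_report(path):
        """Read lines of the form '<SURL> <STATUS>', where STATUS is LOST or UNAVAILABLE."""
        entries = []
        with open(path) as f:
            for line in f:
                line = line.strip()
                if not line or line.startswith("#"):
                    continue
                surl, status = line.split()
                entries.append((surl, status))
        return entries

    def plan_actions(entries):
        """Decide, per file, what an experiment might do with its catalogue."""
        actions = []
        for surl, status in entries:
            if status == "LOST":
                # Replica gone for good: candidate for catalogue cleanup or
                # re-replication from another copy, if one exists.
                actions.append((surl, "remove-replica-or-recover"))
            else:
                # Temporarily unavailable (e.g. hardware failure): just flag it.
                actions.append((surl, "mark-unavailable"))
        return actions

    for surl, action in plan_actions(read_report("lost_files_report.txt")):
        print(action, surl)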