GridPP Technical Meeting - Future of DPM
Virtual Only
Weekly meeting slot for technical topics. We will try to focus on one topic per meeting. We will announce at the Tuesday Ops meeting whether this meeting is going ahead and, if so, the topic to be discussed.
Andy McNab asked whether there was a reason for holding this discussion now.
Sam noted that this isn't so much a sudden change as an ongoing issue. Fabrizio has mentioned several times in recent years that DPM "effort" has to be argued for, and is really mostly supported because he can point at the number of sites using it. There have been 2 significant architecture changes in DPM in the last 10 years (as different project leads took over), and both have been disruptive. There has been some question as to whether DPM development effort will track the changes needed by DOMA, and the current architecture changes actually remove support for some configurations we have in the UK. (DPM itself is also now lagging the state of the art in its storage backend by some years.) Glasgow are also in the process of moving machine room and have extra capital to test Ceph.
Alessandra added that it was also prompted by the recent HOW19 workshop, and in particular by some comments from Simone:
https://indico.cern.ch/event/759388/contributions/3322830/attachments/1815462/2968778/StorageEvolutionJLAB.pdf
Lukasz pointed out the Hadoop work going on in the US:
https://opensciencegrid.org/docs/data/hadoop-overview/
Questions on Alastair's talk:
Simon George asked: How easy would this be for a site to set up?
Alastair replied that it would probably only take a few days to set up a Ceph cluster, and that a simple XrootD/GridFTP setup could also be done in a few days, assuming support from the Tier-1 (as it isn’t well documented). Much more complicated setups are possible.
Alessandra made some points about the "production quality" of the 4 options presented, and how quickly they could be adopted. (For example, the CephFS+GridFTP option relies on GridFTP, which is being retired.)
Mark's talk:
EOS has worked well with a single replica, using ZFS for data resilience. Mark has not been able to get Erasure Coding to work, and lost a significant amount of data before giving up.
Alessandra is going to set up an RSE to try to use Mark’s Cache deployment.
Future Technical meetings on:
Setting up Ceph (Glasgow): second half of May
XCache (BHAM): start of June
DOME (Brunel, Lancaster, Manchester): mid June