WLCG Archival Storage Group

31/S-027 (CERN)




  • Introduction (Oliver)
    • Scope of group
  • Optimal exploitation of archival resources (Oliver)
  • Archive operator knowledge sharing group
    • Summary of HEP sites configuration (by Vladimir Bahyl)
    • BNL view on the metrics (by David Yu)
  • Next steps
    • HEPiX


Andreas Petzold
Bo Jayatilaka
Christoph Paus
Christoph Wissing
David Yu
David Mason
Dorin Lobontu
Esther Acción
Erik Mattias Wadenstein
Gene Oleynik
Helge Meinhard
Jose Flis Molina
Julia Andreeva
Jurrian @Surfsara
Laurent Duflot
Martin Barisitis
Martin Gasthuber
Niklas Edmundosson
Onno @Surfsara
Paul Millar
Pierre Emmanuele Brinette
Robert Verkek
Simon Liu
Tim Folkes
Vanessa @PIC
Vincent Garonne
Xin Zhao
Oliver Keeble
Mario Lassnig
Vladimir Bahyl
Andrea Manzi


* Name and Scope

No objections to combining the two "triggers" within the same group:
  optimal use of tape systems
  knowledge-sharing forum for operators

Focus is tape systems for archival use.
The scope can expand if the group wishes to include:
  tape for non-archival use
  non-tape for archival use
We are flexible.

Discussion on name and scope. No alternative name received much support, so we keep the existing one and explain the scope in the group twiki.


* Optimal exploitation of archival resources (Oliver)

Proposal to collate "best practice for experiments" from each archival site. This should include all the advice users (experiments, FTS, ...) need to exploit the storage system optimally.

Decision - do one iteration of this and schedule a review at the next meeting.

Action - Oliver to create the necessary space and send invitations to contribute.

FTS and monitoring -
  FTS represents a potential common client which can produce comparable metrics between experiments and sites.
  No further feedback, but we need to check how to access FTS staging analysis in the absence of a CERN account.


* Summary of HEP sites configuration (Vlado)

Overview of current situation, discussion on what the group could report.

Can this data be refreshed e.g. once per month?

For the summaries, are we considering HEP data or LHC data?

What does capacity mean?
  total media capacity, including empty media.
  pledge is OK too.

NB - the WLCG resource reporting group is interested in getting used capacity for tape systems, to support experiment operations and storage accounting. This can be discussed at a future meeting.

Action - Vlado to circulate URL of the twiki we will build for the group.

Action - Vlado to propose a platform for collection of metrics.


* BNL view on the metrics (by David Yu)

Sharing "remount" information. Can we share (per VO) the ratio
  number_of_unique_vols_mounted_in_a_day / total_mounts
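
To make the proposed ratio concrete, here is a minimal sketch of how a site might compute it from a mount log. The record layout (VO, day, volume ID per mount event) and the function name are assumptions for illustration only; the meeting did not define a log format, and total_mounts is taken here to mean the mounts for the same VO on the same day.

```python
from collections import defaultdict


def remount_ratio(mount_log):
    """Return {(vo, day): unique_volumes_mounted / total_mounts}.

    mount_log is an iterable of (vo, day, volume_id) tuples, one tuple
    per mount event. This layout is hypothetical, not an agreed format.
    """
    total_mounts = defaultdict(int)
    unique_volumes = defaultdict(set)
    for vo, day, volume_id in mount_log:
        key = (vo, day)
        total_mounts[key] += 1
        unique_volumes[key].add(volume_id)
    # Keys only exist for (vo, day) pairs with at least one mount,
    # so the division below never sees a zero denominator.
    return {key: len(unique_volumes[key]) / total_mounts[key]
            for key in total_mounts}
```

With this definition a value of 1.0 means every volume was mounted only once that day, and lower values indicate more remounting of the same volumes.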

Are we reporting binary or decimal prefixes?
  Decision - decimal.
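
As an illustration of what the decimal decision means for reported numbers, the sketch below scales a byte count using powers of 1000 (1 TB = 10^12 bytes) rather than powers of 1024 (1 TiB = 2^40 bytes). The helper name and unit list are assumptions for illustration, not part of any agreed reporting tool.

```python
DECIMAL_UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB"]


def to_decimal_units(num_bytes):
    """Scale a byte count to the largest decimal (SI) unit below 1000.

    Returns a (value, unit) pair. Values beyond the last unit in
    DECIMAL_UNITS stay expressed in that last unit.
    """
    value = float(num_bytes)
    for unit in DECIMAL_UNITS:
        if value < 1000 or unit == DECIMAL_UNITS[-1]:
            return value, unit
        value /= 1000
```

For example, a volume of 2**40 bytes (1 TiB in binary terms) is reported as roughly 1.1 TB under the decimal convention, which is why sites need to agree on one convention before comparing capacities.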

Utility of sharing these metrics
  - see if changes we make are improving things
  - help sites understand their own performance by comparing with others

What about deleted data?
  Some sites can remove the deleted data volume from their totals
    at CERN this is hard
  Should bytes used include deleted data or not?
  To be further clarified.

How should data be aggregated in distributed sites?
  Average across whole site?


* Next steps

  Next meeting to be scheduled when the first round of information gathering is complete
  Regular reporting to HEPiX encouraged

