DPHEP Topical Workshop on "Full Costs of Curation"

31/3-004 - IT Amphitheatre (CERN)


The purpose of this workshop is to discuss the full "costs of curation" of HEP data, such as that from the LHC, over a period of one to three decades.

To look that far into the future, we can use the past as a "guide" to the scale of the changes that might occur.

25 years ago, the LEP collider was about to start operation. This was shortly followed by an era of significant change in offline computing that continues to this day.

30 years ago, the Z and W had only recently been observed, Wylbur and MVS were still the main central computing services at CERN and central interactive services (VM/CMS, VAX/VMS) were only just making their debut.

The "programming language of choice" was Fortran, albeit often with extensions at the language level (e.g. VAX Fortran) or through libraries (e.g. Hydra, ZBOOK, ZEBRA).

(The default terminal settings - 24 x 80 - are a legacy from this era.)

Storage capacity was minimal (the VXCERN service, originally based on a VAX 8600 or "Venus", had 6 disks of less than 500MB capacity each: one for each LEP experiment, one for the "system" and one for scratch space).

Today, mobile phones have internal storage of several GB, GHz multi-core processors and are pseudo-disposable devices.

What changes will take place in the future?

This is of course unknown, but we do know that there are clear scientific reasons for keeping the data from today's experiments, as well as those from the recent past, fully usable until well into the future.

This workshop is about establishing the costs of such an exercise.

Having established these costs, according to a number of possible scenarios (e.g. "best case", "worst case"), this information will then be used for resource and budget planning.

  • Monday, January 13
    • 10:00 AM - 1:00 PM
      Lessons from the past - 1
      • 10:00 AM
        Introduction: Purpose and Outcome(s) 20m
        Speaker: Dr Jamie Shiers (CERN)
      • 10:20 AM
        The JADE Resurrection 20m
        Speaker: Stefan Kluth (Max-Planck-Institut fuer Physik (Werner-Heisenberg-Institut) (D))
      • 10:40 AM
        The LEP Experience 30m
        Speakers: Marcello Maggi (Universita e INFN (IT)), Matthias Schroeder (CERN), Olof Barring (CERN), Dr Ulrich Schwickerath (CERN)
      • 11:10 AM
        Coffee 15m
      • 11:25 AM
        The HERA Inheritance 30m
        Speakers: Cristinel Diaconu (Centre National de la Recherche Scientifique (FR)), David Michael South (Deutsches Elektronen-Synchrotron (DE))
      • 11:55 AM
        The Tevatron Tribulations 30m
        Speakers: Dr Gene Oleynik (Fermilab), Silvia Amerio (Universita e INFN (IT))
      • 12:25 PM
        Discussion 15m
      • 12:40 PM
        The costs in digital curation – activities and approaches to cost modeling in the 4C project 20m
        Speaker: Anders Bo NIELSEN (Danish National Archives)
    • 1:00 PM - 2:00 PM
      Lunch 1h
    • 2:00 PM - 5:00 PM
      Lessons from the past - 2
      • 2:00 PM
        The Objectivity (and Oracle?) Migration(s) 30m
        Speakers: Dr Andrea Valassi (CERN), Andrew Branson (University of the West of England (GB))
      • 2:30 PM
        Discussion 15m
      • 2:45 PM
        Adapting to the grid 30m
        Speaker: Tommaso Boccali (Sezione di Pisa (IT))
      • 3:15 PM
        Discussion 15m
      • 3:30 PM
        Coffee break 15m
      • 3:45 PM
        From SPIRES to INSPIRE 30m
        Speakers: Dr Salvatore Mele (CERN), Sunje Dallmeier-Tiessen (Humboldt-Universitaet zu Berlin (DE)), Tim Smith (CERN)
      • 4:15 PM
        Discussion 15m
  • Tuesday, January 14
    • 10:00 AM - 1:00 PM
      Planning the future
      • 10:00 AM
        Bit preservation: the 10, 20 and 30 year outlook 30m
        Speakers: Dmitry Ozerov (DESY), German Cancio Melia (CERN)
        HEPiX bit preservation Working Group WWW
        XLS sheet
      • 10:30 AM
        Discussion 15m
      • 10:45 AM
        Coffee Break 15m
      • 11:00 AM
        New architectures: multi-core, ARM, etc. 30m
        General discussion with contributions as available.
      • 11:30 AM
        Discussion 15m
      • 11:45 AM
        Migration vs Emulation (see subcontributions) 1h
        Speakers: Gerardo Ganis (CERN), Jakob Blomer (CERN), John Harvey (CERN - PH/SFT), Dr Pere Mato Vila (CERN), Predrag Buncic (CERN)
        • An Approach to Emulation Exploiting Virtualisation 20m
        • CernVM[FS] technology for Emulation 20m
        • Discussion on Emulation / Migration costs and implications 20m
      • 12:45 PM
        Discussion 15m
    • 1:00 PM - 2:00 PM
      Lunch 1h
    • 2:00 PM - 5:00 PM
      Predicting the future
      • 2:00 PM
        Analysis 1h
        Speaker: All
      • 3:00 PM
        Summary and end of Thematic Workshop 30m
        • Brief conclusions of the workshop
        • Next steps, e.g. ICFA report, DPHEP paper, medium term plan updates and Resource Review Board input
        • Discussion on non-CERN experiments
        • Next events
        • Close
        Speaker: Dr Jamie Shiers (CERN)
      • 3:30 PM
        Additional Breakout on CCEX 1h