Speaker
Manuel Delfino Reznicek
(Universitat Autònoma de Barcelona (ES))
Description
Several scientific fields, including Astrophysics, Astroparticle Physics, Cosmology, Nuclear and Particle Physics, and Research with Photons, estimate that by the 2020s they will require data handling systems with data volumes approaching the Zettabyte, distributed amongst as many as 10^18 individually addressable data objects (Zettabyte-Exascale systems). It may be convenient or necessary to deploy such systems across multiple physical sites. This paper describes the findings of a working group composed of experts from several large European scientific data centres on architectures and methodologies that should be studied by building proof-of-concept systems, in order to prepare the way for building reliable and economical Zettabyte-Exascale systems. Key ideas emerging from the study are: the introduction of a global Storage Virtualization Layer which is logically separated from the individual storage sites; the need for maximal simplification and automation in the deployment of the physical sites; the need to present users with an integrated view of their custom metadata and technical metadata (such as the last time an object was accessed); the need to apply modern, efficient techniques to handle the large metadata volumes (e.g. Petabytes) that will be involved; and the challenges generated by the very high rate of technical metadata updates. The paper also addresses the challenges associated with the need to preserve scientific data for many decades. It is presented in the spirit of sharing the findings with both the user communities and data centre experts, in order to receive feedback and generate interest in starting prototyping work on the Zettabyte-Exascale challenges.
Primary author
Manuel Delfino Reznicek
(Universitat Autònoma de Barcelona (ES))
Co-authors
Andreas Heiss
(KIT)
Andrew Sansum
(RAL)
Benoit Delaunay
(IN2P3)
Brian Matthews
(RAL)
Christian Neissner
(IFAE)
David Corney
(RAL)
German Cancio
(CERN)
Giovanni Lamanna
(CNRS)
Ian Bird
(CERN)
Ian Peter Collier
(RAL)
Jamie Shiers
(CERN)
Jean-Yves Nief
(IN2P3)
Jose Flix Molina
(CIEMAT)
Luca dell'Agnello
(INFN)
Marcello Maggi
(INFN)
Maria Del Carmen Porto Fernandez
(CIEMAT)
Markus Schulz
(CERN)
Martin Gasthuber
(DESY)
Patrick Fuhrmann
(DESY)
Pierre-Etienne Macchi
(IN2P3)
Tommaso Boccali
(INFN)
Vanessa Acin Portella
(IFAE)
Volker Guelzow
(DESY)