Speaker
Luca dell'Agnello
(INFN-CNAF)
Description
Long-term preservation of experimental data (both raw and derived formats) is one of the emerging requirements of scientific collaborations. Within the High Energy Physics community, the Data Preservation in High Energy Physics (DPHEP) group coordinates this effort.
CNAF is not only one of the Tier-1s for the LHC experiments; it is also a computing center providing computing and storage resources to many other HEP and non-HEP scientific collaborations, including the CDF experiment. Having ended data taking in 2011, CDF now faces the challenge of both preserving the large amount of data produced over several years of data taking and retaining the ability to access and reuse it in the future.
CNAF is heavily involved in the CDF Data Preservation activities, in collaboration with the FNAL computing sector. At the moment about 5 PB of data (raw data and analysis-level “ntuples”) are being copied from FNAL to the CNAF tape library, and the framework to subsequently access the data is being set up. In parallel with the data access system, a data analysis framework is being developed that will allow the complete CDF analysis chain to be run in the long-term future, from raw data reprocessing to analysis-level “ntuple” production. In this contribution we illustrate the technical solutions we put in place to address the issues encountered as we proceeded in this activity.
Author
Luca dell'Agnello
(INFN-CNAF)