Objectives
The primary objectives continue to be the sharing of experiences and emerging issues regarding the long-term preservation, re-use and sharing of scientific (and associated) data.
The update of the OAIS reference model (ISO 14721), the various Certification methodologies, requirements from Funding Agencies (FAs) for (FAIR) Data Management Plans, together with technological changes / challenges are all relevant discussion points.
Of particular interest are experience and position papers from Large Scale Multi-National projects but all papers / posters that address the above issues and / or the traditional topics of PV are welcome.
For many disciplines, e.g. those that make "observations" (by definition unique and unrepeatable), the very (very) long term is of interest. In addition, the possibility of finding synergies, even across seemingly disparate disciplines, that could eventually result in significant cost savings is clearly of importance.
These issues are reflected in the sessions described below.
With respect to previous conferences in this series, and whilst maintaining the overall thrust, additional goals for PV2023 include:
- Attracting more scientific communities
- Broadening information exchange, sharing of experiences, tools and even services
- Keeping in step with (or ahead of) funding agencies / policy makers in their push for Long Term Data Preservation and Open Data
Conference Sessions
The following sessions are foreseen for PV2023.
Session 1: Ensuring long-term data and knowledge preservation (the "P" in PV);
Session 2: Adding value to data and facilitation of data use (the "V" in PV);
Session 3: Short - medium term issues related to policy, technology, guidelines, FAIR / TRUST principles, certification;
Session 4: (Very) long term issues.
A new session (for the PV series) will be a set of "lightning talks" with a light-weight review process. The call for mini-abstracts will be sent to registered participants once the formal review of the full abstracts for posters and papers has been completed.
Session 1: Ensuring long-term data and knowledge preservation (the "P" in PV);
This session is intended to cover best practices for the long-term preservation of data and other results associated with research across the preservation lifecycle, from the submission of data packages for preservation to the access of data products. This includes the organisational structures, policies and standards adopted by data centres and archives to assure cost-effective preservation, together with risk management, uncertainty quantification, quality assessment and the evaluation of preservation capabilities.
Novel architectures and tools used to realise different preservation strategies, as well as standards, tools and languages to capture the preservation context, including the preservation of data formats, the use of identifiers, metadata, semantics, data provenance, quality and uncertainty, are all highly relevant.
Topics for this session include:
- Architectures and tools for curation and preservation
- Standards for preservation, access and exploitation; including uncertainty and risk
- Policy, exploitation and preservation strategies
- Risk assessment and appraisal of data value
Session 2: Adding value to data and facilitation of data use (the "V" in PV);
The intended focus of this session is activities that add value to archived data, facilitate their (re-)use or produce novel data services. Data archivists often focus most of their energy on creating well-formed, well-documented archives with the expectation that they will be available for the next 50 to 100 years. However, archived data are meaningless if they cannot be easily retrieved, understood, and used. We therefore invite submissions from projects or archives that are rising to the challenge of enhancing data in order to facilitate the exploitation of data assets.
Topics for this session include:
- Added value services and applications on top of archives
- Techniques and tools for facilitating data access and use
- Approaches to supporting knowledge discovery
- Integrating user feedback into archives and repositories
- Validation and reanalysis of historic data sets
- Integration of new data sources and different types of data
- Return on investment for value add services
Session 3: Short - medium term issues related to policy, technology, guidelines, FAIR / TRUST principles, certification;
Whereas Certification of (trustworthy) digital repositories has been an issue for quite some time, FAIR and the recently proposed TRUST principles (see iPRES 2019) are newer. Whilst to some extent these are still being debated, we are still a long way from having all digital repositories certified, as we are from all data being fully FAIR compliant. Indeed, as we move to a world where storage and even services behind digital preservation are offered through the Cloud, it is unclear how at least some of these methodologies could apply. Moreover, there are some digital repositories that contain data that is still actively used, where the data was deposited prior to the OAIS standard being formalised. Whilst there is pressure from Funding Agencies to implement FAIR data as well as to develop and maintain Data Management Plans, these things will not stand still and will evolve throughout the lifetime of digital repositories. Can one adapt to all these changes? Must one adapt?
Topics for this session include:
- Experience with any of the above (or other relevant) methodologies and / or principles
- Alternative (?) models, such as the Digital Preservation Coalition's (DPC) Rapid Assessment Model (RAM)
- Experience with cloud-based solutions for digital preservation and data re-use
- Funding models for data re-use (e.g. Open Data), which is typically not covered by the initial project budget.
Session 4: (Very) long term issues.
Even though considerable advances have been made in the field of digital data preservation, the digital world is still in its infancy. For disciplines that need to preserve data for as long as a human life, or even much longer, new issues arise. Looking back to the beginning of the digital age, arguably in the 1950s, what, if anything, remains of the digital data that was created then? Looking forward a similar period of time, say to the end of the 21st century, will bring unknown and unforeseeable challenges. Who knows what storage technology will be widespread then, and how can one plan for the many migrations and changes in architecture that will take place in the meantime? Can OAIS and associated certification methodologies ensure that today's data will still be accessible and usable at the end of the century? However challenging this may be, it is a very small step compared to what is needed in fields such as nuclear waste management, one of the numerous unfortunate legacies of the so-called Anthropocene epoch. That these challenges might appear insurmountable does not mean that we should not address them.
Topics for this session include:
- Planning for changes in technology, personnel, services and funding streams ("easy");
- Thinking of scenarios where the "knowledge base" may differ considerably from that of today. This could be at many levels, for example where assumptions regarding computer architecture are no longer valid (but deeply ingrained in the knowledge and data that is captured today), where advances in science and technology might have a positive and possibly revolutionary effect (as we have seen many times over the past decades) and / or where even basic assumptions regarding language will no longer be valid.
Given that there are so many unknowns in this area, it is perhaps best not to attempt to define it in too much detail and leave the field open for ideas (and concerns).
Posters
Posters are requested to be in A0 portrait format (841 mm W x 1189 mm H).
Please also ensure that you email a digital copy of the poster to pv2023-loc@cern.ch so that your poster is accessible to our remote participants. Poster files will be stored as attachments to this Indico event.
CERN onsite printing offered
We can offer free onsite printing of your poster, so that you do not need to travel with a poster tube.
All you need to do is email the finalised PDF file to pv2023-loc@cern.ch, clearly stating that you want us to print it. We will submit the request to our Print Service on your behalf.
Please take note of the following conditions:
- Absolute deadline for us to receive your file for printing is 2 weeks before the event, on Monday 17 April.
- We will submit your file for printing, then as soon as we are notified it is ready for collecting (within ~2-3 days), we will email you a photo of the poster, for you to check you are happy with it.
- We will only print once, so if quality is not to your satisfaction, you will need to arrange to print and bring your own hard copy (hence the 2 week lead time in case of quality issues).
- We will not accept late updates.
- We will not offer any onsite printing after Monday 17 April, so if you have not sent us a file, you will be expected to bring your own hard copy.
Poster file preparation
In order to ensure the best quality results, please take note of the following pointers provided by our Print Service:
- Recommended file type: PDF. Remember to set the paper size in your PDF application to A0 and to fit the content to the A0 page.
- For PDF creation tips and other information, see: https://printservice.web.cern.ch/printservice/Services/pdf/PosterTips.pdf (login required, please try using a guest or social media account if you do not have a CERN account).
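As a rough aid for checking the page size before sending a file, the A0 portrait dimensions given above (841 mm x 1189 mm) can be converted to PDF points (1 point = 1/72 inch). The sketch below is only illustrative: the function names and the 2-point tolerance are assumptions, not anything specified by the Print Service, and the page size of your own file still has to be read from your PDF application.

```python
# Minimal sketch: compare a PDF page size (in points) with A0 portrait.
# Helper names and the tolerance are hypothetical; only the A0 dimensions
# (841 mm x 1189 mm) come from the poster instructions.

MM_PER_INCH = 25.4
POINTS_PER_INCH = 72  # PDF unit: 1 point = 1/72 inch

A0_WIDTH_MM = 841
A0_HEIGHT_MM = 1189


def mm_to_points(mm: float) -> float:
    """Convert millimetres to PDF points."""
    return mm / MM_PER_INCH * POINTS_PER_INCH


def is_a0_portrait(width_pt: float, height_pt: float,
                   tolerance_pt: float = 2.0) -> bool:
    """Return True if the page size matches A0 portrait within the tolerance."""
    return (abs(width_pt - mm_to_points(A0_WIDTH_MM)) <= tolerance_pt
            and abs(height_pt - mm_to_points(A0_HEIGHT_MM)) <= tolerance_pt)


if __name__ == "__main__":
    width = mm_to_points(A0_WIDTH_MM)
    height = mm_to_points(A0_HEIGHT_MM)
    print(f"A0 portrait: {width:.0f} x {height:.0f} pt")
```

An A0 portrait page is therefore roughly 2384 x 3370 points; a file reporting those values swapped is in landscape orientation and would need to be rotated or re-exported.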