Description
Efficient and maintainable in-file metadata is crucial for large-scale event processing. The ATLAS experiment's Athena event-processing framework relies on a complex navigational and metadata infrastructure to manage event processing across diverse workflows. As experimental demands grow, inefficiencies and redundancies in the current metadata infrastructure have limited storage efficiency, degraded reliability, and increased the maintenance burden.
We describe a comprehensive redesign of the metadata and navigation infrastructure, which handles data organization and retrieval. The redesign aims to simplify data relationships and reduce architectural and code duplication. By consolidating metadata handling into a more streamlined software design, our prototype improves both storage efficiency and maintainability.
We also survey metadata usage across workflows in order to normalize and deduplicate analysis metadata, making the system more robust and transparent. Finally, we explore the potential of modern storage technologies, particularly RNTuple attributes, to support the evolution of in-file metadata structures. These developments complement I/O framework modernization by providing a coherent and efficient metadata layer atop the next-generation persistence backend.
Together, these improvements demonstrate a path toward more maintainable, efficient, and scalable metadata frameworks applicable to large-scale HEP software beyond ATLAS.