Speaker
Description
DUNE is a next-generation neutrino oscillation experiment. Over its decades-long operational lifetime, it is expected to collect many exabytes of heterogeneous data. It is critical that this data be correctly characterized with respect to its associated conditions metadata, the non-event data used to process event data during offline reconstruction and analysis. To meet DUNE’s offline computing system requirements of scalability, operational efficiency, maintainability, automatability, extensibility, data integrity, and portability, while improving upon existing conditions metadata management systems, a modern, AI-native conditions metadata exchange called Metadex is being developed.
Metadex is designed to be highly available, performant, scalable, secure, maintainable, and portable. It will incorporate high-performance technologies and AI-native capability throughout its architecture, and will leverage a multi-agent architecture to automate ETL/ELT pipelines. This automation includes pipeline workflow planning, flagging data anomalies, gracefully handling pipeline error conditions, dynamic schema mapping and data transformation, and dynamic query construction from natural language input. While the initial focus is on supporting DUNE conditions metadata use cases, the intent is for Metadex to be HEP experiment agnostic and extensible to a variety of metadata management use cases.
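
As an illustration only, the schema mapping and anomaly flagging steps named above could look roughly like the following minimal Python sketch. It is not Metadex's design or API: the field names (`run`, `hv_kV`, `temp`), the `FIELD_MAP` table, and the `map_record` helper are all hypothetical, and a static lookup table stands in for whatever agent-driven mapping the system actually uses.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

# Hypothetical mapping: source field -> (target field, converter).
# In an agent-driven system this table would be produced dynamically;
# here it is hard-coded purely for illustration.
FIELD_MAP: dict[str, tuple[str, Callable[[Any], Any]]] = {
    "run": ("run_number", int),
    "hv_kV": ("high_voltage_v", lambda kv: float(kv) * 1000.0),
    "temp": ("temperature_k", float),
}


@dataclass
class MappedRecord:
    """Result of mapping one source record onto the target schema."""
    values: dict[str, Any] = field(default_factory=dict)
    anomalies: list[str] = field(default_factory=list)


def map_record(source: dict[str, Any]) -> MappedRecord:
    """Map a source record to the target schema, flagging problems
    instead of raising, so the pipeline can continue gracefully."""
    result = MappedRecord()
    for src_name, (dst_name, convert) in FIELD_MAP.items():
        if src_name not in source:
            result.anomalies.append(f"missing field: {src_name}")
            continue
        try:
            result.values[dst_name] = convert(source[src_name])
        except (TypeError, ValueError):
            result.anomalies.append(
                f"bad value for {src_name}: {source[src_name]!r}"
            )
    # Fields the mapping does not know about are reported, not dropped silently.
    for extra in sorted(set(source) - set(FIELD_MAP)):
        result.anomalies.append(f"unmapped field: {extra}")
    return result
```

Calling `map_record({"run": "42", "hv_kV": "1.5", "temp": "abc", "note": "x"})` in this sketch yields a record with the two convertible fields mapped and two anomalies recorded (one for the unparsable `temp` value, one for the unmapped `note` field), rather than aborting the pipeline on the first error.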