Description
Analysis Productions is a declarative n-tupling service which has processed over 1 exabyte of LHCb data since 2024 with the DIRAC Transformation System. It is the primary method for producing LHCb n-tuples for analysis and has produced approximately 50M files.
Since the start of Run 3, the demand for n-tuples has increased dramatically, with 22k samples created in 2025 alone. This has significantly increased the workload on Grid resources, notably storage, and intensified operational requirements. Meeting this scaling challenge has required more automation, with checks and balances to ensure that resources continue to be used efficiently and responsibly.
Analysis Productions addresses this challenge with several capabilities. A web interface lets analysts browse continuous integration test results and manage LHCb's extensive collection of n-tuples, substantially reducing prototyping overhead.
Beyond standard LHCb n-tupling workflows, the service supports configuring and running custom n-tuple filtering and transformation steps at scale, for example with ROOT RDataFrame.
To ensure responsible storage utilisation, sample lifecycle management features make working groups and sample owners accountable for the timely cleanup of obsolete n-tuples, with roles and permissions assigned automatically through LbFence (Glance) integration.
Finally, preservation by default ensures permanent metadata retention with automatic archival to tape storage for n-tuples linked to a publication, safeguarding long-term reproducibility.
This talk demonstrates how Analysis Productions functions as a production-ready exascale analysis facility, minimising operational burden while providing an intuitive, continuously evolving interface for analysts and working groups to produce, monitor, access, and manage n-tuples for their analyses.