ALICE Run 3 Analysis Framework

18 May 2021, 11:29
13m
Short Talk Offline Computing Software

Speaker

Anton Alkin (CERN)

Description

In LHC Run 3, the ALICE Collaboration will have to cope with an increase in lead-lead collision data of two orders of magnitude compared to the Run 1 and 2 data-taking periods. The Online-Offline (O$^2$) software framework has been developed to allow for distributed and efficient processing of this unprecedented amount of data. Its design, which is based on a message-passing back end, required the development of a dedicated Analysis Framework that uses the columnar data format provided by Apache Arrow. The O$^2$ Analysis Framework provides a user-friendly high-level interface and hides the complexity of the underlying distributed framework. It allows users to access and manipulate the data in the new format both in the traditional "event loop" style and in a declarative approach using bulk processing operations based on Arrow’s Gandiva sub-project. Building on the well-tested system of analysis trains developed by ALICE in Run 1 and 2, the AliHyperloop infrastructure is being developed. It provides a fast and intuitive user interface for running demanding analysis workflows in the GRID environment and on the dedicated Analysis Facility. In this document, we report on the current state and ongoing developments of the Analysis Framework and of AliHyperloop, highlighting the design choices and the benefits of the new system.
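
The difference between the "event loop" and the declarative, bulk-processing styles mentioned in the description can be illustrated with a minimal sketch. This is not the O$^2$ API, nor Arrow or Gandiva code: plain Python lists stand in for columns, and the table contents, the cut values (`PT_MIN`, `ETA_MAX`) and the function names are all hypothetical, chosen only for illustration.

```python
from typing import Callable, Dict, List

# A tiny columnar "table": one list per column, rows share an index.
# Plain lists stand in for Arrow columns (assumption, illustration only).
tracks: Dict[str, List[float]] = {
    "pt":  [0.4, 1.2, 2.5, 0.8, 3.1],
    "eta": [0.1, -0.9, 0.3, 1.4, -0.2],
}

PT_MIN = 1.0   # hypothetical transverse-momentum cut
ETA_MAX = 0.8  # hypothetical pseudorapidity cut


def select_event_loop(table: Dict[str, List[float]]) -> List[int]:
    """Row-by-row selection: the user writes the loop and the condition."""
    selected = []
    for i in range(len(table["pt"])):
        if table["pt"][i] > PT_MIN and abs(table["eta"][i]) < ETA_MAX:
            selected.append(i)
    return selected


def select_declarative(
    table: Dict[str, List[float]],
    predicates: Dict[str, Callable[[float], bool]],
) -> List[int]:
    """Bulk selection: each predicate is applied to a whole column,
    and the resulting boolean masks are combined with a logical AND."""
    n_rows = len(next(iter(table.values())))
    mask = [True] * n_rows
    for column, predicate in predicates.items():
        column_mask = [predicate(value) for value in table[column]]
        mask = [m and c for m, c in zip(mask, column_mask)]
    return [i for i, keep in enumerate(mask) if keep]


# The user only declares the cuts; the traversal is left to the helper.
cuts = {
    "pt": lambda v: v > PT_MIN,
    "eta": lambda v: abs(v) < ETA_MAX,
}

loop_result = select_event_loop(tracks)
bulk_result = select_declarative(tracks, cuts)
```

Both functions select the same rows. In the declarative form the user states what to select and leaves the traversal to the helper, which is the kind of separation that lets a backend evaluate a filter over whole columns at once.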

Primary authors

Anton Alkin (CERN), Giulio Eulisse (CERN), Jan Fiete Grosse-Oetringhaus (CERN), Maja Jadwiga Kabus (Warsaw University of Technology (PL)), Peter Hristov (CERN)