19–25 Oct 2024
Europe/Zurich timezone

Advancing ATLAS DCS Data Analysis with a Modern Data Platform

24 Oct 2024, 14:24
18m
Room 1.B (Medium Hall B)

Talk Track 1 - Data and Metadata Organization, Management and Access Parallel (Track 1)

Speaker

Michelle Ann Solis (University of Arizona (US))

Description

This paper presents a novel approach to enhance the analysis of ATLAS Detector Control System (DCS) data at CERN. DCS data are traditionally stored in Oracle databases optimized for WinCC archiver operations, which makes it difficult to run extensive analyses across long timeframes and multiple devices or to correlate the data with conditions data. We introduce techniques to improve troubleshooting and analysis of ATLAS New Small Wheel (NSW) DAQ links, including migrating the data to Apache Parquet for efficient storage and using Big Data technologies such as Apache Spark and Apache Hadoop for analysis. Jupyter notebooks on the SWAN service, combined with Spark, Pandas, and the wider Python ecosystem, provided a highly efficient analysis workflow. The approach was well received by NSW experts, who quickly became proficient and were able to run advanced analyses within a short time.
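As a rough illustration of correlating DCS readings with conditions data in Pandas, the sketch below tags each reading with the run that was active when it was recorded and then counts zero-valued readings per run and device. It is not taken from the presented work: the column names (`ts`, `device`, `value`, `run`), the sample values, and the reading of a zero as a link dropout are all assumptions made for the example. In a workflow like the one described, the frames would presumably be loaded from Parquet files rather than built in memory.

```python
import pandas as pd

# Hypothetical DCS readings: one row per (timestamp, device) value change.
dcs = pd.DataFrame({
    "ts": pd.to_datetime([
        "2024-01-01 00:00:10", "2024-01-01 00:00:40",
        "2024-01-01 00:01:20", "2024-01-01 00:02:05",
    ]),
    "device": ["link_A", "link_B", "link_A", "link_B"],
    "value": [1.0, 0.0, 0.0, 1.0],
})

# Hypothetical conditions data: each row marks the start time of a run.
conditions = pd.DataFrame({
    "ts": pd.to_datetime(["2024-01-01 00:00:00", "2024-01-01 00:01:00"]),
    "run": [100, 101],
})

# merge_asof requires both frames to be sorted by the join key.
# direction="backward" attaches the most recent run start at or before each reading.
tagged = pd.merge_asof(
    dcs.sort_values("ts"),
    conditions.sort_values("ts"),
    on="ts",
    direction="backward",
)

# Count zero-valued readings (assumed here to mean a dropout) per run and device.
dropouts = (
    tagged[tagged["value"] == 0.0]
    .groupby(["run", "device"])
    .size()
    .to_dict()
)
print(dropouts)
```

The as-of join is used here because readings and run boundaries rarely share exact timestamps; an equality join would match nothing.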

Authors

Andrea Formica (Université Paris-Saclay (FR))
Luca Canali (CERN)
Michelle Ann Solis (University of Arizona (US))
