19–25 Oct 2024
Europe/Zurich timezone

Leveraging public cloud resources for the processing of CMS open data

21 Oct 2024, 13:30
18m
Room 2.B (Conference Room)

Talk
Track 8 - Collaboration, Reinterpretation, Outreach and Education
Parallel (Track 8)

Speaker

Kati Lassila-Perini (Helsinki Institute of Physics (FI))

Description

The CMS experiment at the Large Hadron Collider (LHC) regularly releases open data and simulations, enabling a wide range of physics analyses and studies by the global scientific community. The recently introduced NanoAOD data format offers a more streamlined and efficient approach to data processing and allows faster analysis turnaround. However, the larger MiniAOD format retains richer information that may be crucial for certain analyses.

To ensure that CMS open data remain usable to their full extent in the long term, this work explores the potential of public cloud resources for the computationally intensive processing of the MiniAOD format. Many open data users may not have access to the computing resources needed to handle the large MiniAOD datasets. By offloading this processing to scalable cloud infrastructure, researchers can gain processing power and make their data analysis workflows more efficient, at a moderate short-term cost.

The study investigates best practices and challenges for effectively utilizing public cloud platforms to handle the processing of CMS MiniAOD data, with a focus on quantifying the overall time and cost of using these resources. The ultimate aim is to empower the CMS open data community to maximize the scientific impact of this valuable resource.

Primary author

Kati Lassila-Perini (Helsinki Institute of Physics (FI))
