Description
The increasing adoption of columnar data formats and lightweight event representations, such as CMS NanoAOD, has made remote data access a significant factor in the performance of physics analysis workflows. In this context, understanding the performance characteristics of different data serving technologies under realistic network conditions is critical.
This work presents a comparative study of remote read performance for ROOT TTree and RNTuple data accessed over low- and high-latency networks. We evaluate two server configurations: an XRootD server and an Nginx server, both delivering data over HTTPS with SciTokens-based authentication.
Beyond server-side differences, a key motivation of this study is to investigate the extent to which observed performance is limited by client-side behavior. In particular, we examine the impact of reusable HTTP connection pools in the ROOT client stack by comparing configurations with and without connection reuse enabled. This allows us to disentangle server capabilities from client library limitations and to identify potential bottlenecks arising from connection management under high-latency conditions.
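To illustrate why connection reuse can matter under high latency, the following is a minimal back-of-the-envelope sketch, not the measurement code used in this study. It assumes strictly sequential requests, one round trip per request, and a fixed number of round trips to set up each new HTTPS connection (two by default, corresponding to a TCP handshake plus a TLS 1.3 handshake). The function name `estimated_read_time` and all of its parameters are hypothetical and chosen only for illustration.

```python
def estimated_read_time(n_requests, rtt_s, reuse_connections,
                        handshake_rtts=2, transfer_s=0.0):
    """Latency-only estimate (seconds) for sequential HTTPS range requests.

    Assumptions (illustrative, not taken from the study):
      - requests are issued one after another, with no pipelining;
      - each request costs one network round trip (rtt_s);
      - each new connection costs handshake_rtts round trips
        (default 2: one for TCP, one for a TLS 1.3 handshake);
      - transfer_s is the total time spent moving payload bytes.
    """
    if n_requests < 0 or rtt_s < 0 or handshake_rtts < 0:
        raise ValueError("inputs must be non-negative")

    # With reuse, a single connection serves every request;
    # without reuse, every request opens its own connection.
    if reuse_connections:
        n_connections = min(n_requests, 1)
    else:
        n_connections = n_requests

    setup_s = n_connections * handshake_rtts * rtt_s
    request_s = n_requests * rtt_s
    return setup_s + request_s + transfer_s


if __name__ == "__main__":
    # Compare both modes at a low and a high round-trip time.
    for rtt in (0.001, 0.1):
        without = estimated_read_time(100, rtt, reuse_connections=False)
        with_reuse = estimated_read_time(100, rtt, reuse_connections=True)
        print(f"rtt={rtt * 1000:.0f} ms  no reuse={without:.2f} s  reuse={with_reuse:.2f} s")
```

In this simplified model, the per-connection setup cost scales with both the request count and the round-trip time, so the gap between the two modes widens as latency grows. Real clients may issue requests in parallel, use vectored reads, or resume TLS sessions, none of which the sketch accounts for.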
Our results highlight the relative performance of Nginx and XRootD for remote analysis use cases and demonstrate that client-side connection handling can play a dominant role in overall performance. These findings provide guidance for optimizing remote analysis workflows and inform future developments in ROOT client libraries and data access strategies for distributed HEP computing.