28th Conference on Computing in High Energy and Nuclear Physics (CHEP 2026)

Name: 28th Conference on Computing in High Energy and Nuclear Physics (CHEP 2026)
Start: 2026-05-25T08:00:00+07:00
End: 2026-05-29T14:00:00+07:00
Location: Chulalongkorn University

Asia/Bangkok timezone

Transfer Learning based Resource Usage Prediction in In-network Caching

26 May 2026, 17:45

18m

MHMK M01

Oral Presentation, Track 1 - Data and metadata organization, management and access

Speaker: Chin Guok (ESnet)

The rapid growth of data volumes in high-energy physics (HEP) collaborations, such as the Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider (LHC), has necessitated the adoption of regional in-network caching strategies to mitigate data access latency. However, these caches often exhibit varying efficiencies across locations due to differing access patterns and storage policies. Improving resource utilization could significantly increase the performance of scientific computing infrastructure, yet exploring what-if scenarios for capacity planning remains challenging.

This study investigates cache utilization patterns across three regional caches supporting the CMS experiment, situated in Southern California, Chicago, and Boston. We have developed two complementary prediction methodologies to forecast cache hit rates under hypothetical storage capacities: an LSTM-based model employing transfer learning, and a simpler analytical approach leveraging the footprint of active files for estimating cache hits. The transfer learning methodology utilizes observed modifications in storage capacity at the Southern California site to inform predictions for the Chicago and Boston caches, which have maintained their original capacities. A central contribution of this work is the application of these two distinct prediction techniques to cross-validate the results, thereby enhancing confidence in the what-if scenario analyses.
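As a rough illustration of what a what-if capacity analysis can look like, the sketch below replays an access trace through a simulated cache at several hypothetical capacities and reports the resulting hit rate. It is not the method described in this abstract: it swaps in a plain LRU replay as a stand-in for the footprint-based analytical estimate, and the function name `lru_hit_rate`, the trace format, and the toy data are all assumptions made for the example.

```python
from collections import OrderedDict


def lru_hit_rate(trace, capacity_bytes):
    """Replay (file_id, size_bytes) accesses through a byte-limited LRU cache.

    Returns the fraction of accesses served from the cache.
    Hypothetical helper for illustration; not taken from the presented work.
    """
    cache = OrderedDict()  # file_id -> size, least recently used first
    used = 0
    hits = 0
    for file_id, size in trace:
        if file_id in cache:
            hits += 1
            cache.move_to_end(file_id)  # mark as most recently used
            continue
        if size > capacity_bytes:
            continue  # file can never fit; counted as a miss
        # Evict least recently used files until the new file fits.
        while used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)
            used -= evicted_size
        cache[file_id] = size
        used += size
    return hits / len(trace) if trace else 0.0


# Toy trace (made-up data): three unit-size files accessed in a repeating cycle.
trace = [("a", 1), ("b", 1), ("c", 1), ("a", 1), ("b", 1), ("c", 1)]
for capacity in (2, 4):
    print(capacity, lru_hit_rate(trace, capacity))
```

On this toy trace, a cache too small to hold the cycling set of files gets no hits, while one large enough to hold all three serves every repeat access. This is the kind of capacity sensitivity a what-if study aims to quantify on real access logs.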

Our findings indicate that doubling the storage capacity of the Chicago cache could raise its cache hit rate from 50% to 80%, significantly improving resource utilization. The integration of machine learning and analytical techniques presented herein offers a validated framework for optimizing cache efficiency, informing resource allocation, and guiding future cache deployments and resource management strategies within large-scale scientific collaborations.

Authors: Erica Wang (California Institute of Technology), Alex Sim (Lawrence Berkeley National Laboratory), Kesheng Wu (Lawrence Berkeley National Laboratory), Justas Balcas (ESnet), Brendan White (ESnet), Chin Guok (ESnet), Inder Monga (Lawrence Berkeley National Laboratory), Diego Davila Foyo (Univ. of California San Diego (US)), Frank Wurthwein (UCSD), Harvey Newman (California Institute of Technology (US))

CHEP26-XCache-2604.pdf
