Description
As part of the IRIS-HEP software institute effort and U.S. CMS activities, the Coffea-Casa analysis facility team has executed an Integration Challenge. One goal of this challenge was to demonstrate a full CMS analysis running on the facility and to integrate the IRIS-HEP software stack into a production environment. We describe the solutions deployed at the facility to support and execute the challenge tasks.
The Nebraska facility provides more than 2,000 cores for fast-turnaround, low-latency analysis. To achieve the highest event-processing rates, multiple scaling backends were evaluated, including HTCondor and Kubernetes resources, using both Dask and TaskVine schedulers. This setup also enabled a comparison of two Dask cluster management services, Dask LabExtension and Dask Gateway, under demanding conditions.
A robust set of XCache servers with a redirector, previously deployed and tested during the 200 Gbps Challenge, was used to cache CMS Integration Challenge datasets and reduce wide-area network traffic. In addition, the Integration Challenge explored several approaches for delivering skimmed physics data, including the use of different analysis frameworks and data formats. This required enabling multiple storage solutions at the facility, such as S3, and evaluating ServiceX for data skimming. ServiceX is a data-delivery system for high-energy physics designed to provide fast access to large datasets stored in ROOT and other HEP-specific formats.