Oct 27 – 30, 2025
CERN
Europe/Zurich timezone

Coffea Framework: Current Status and Recent Updates

Oct 30, 2025, 2:00 PM
50m
222/R-001 (CERN)

Speaker

Iason Krommydas (Rice University (US))

Description

This tutorial will provide a comprehensive introduction to the current state of Coffea (Columnar Object Framework for Effective Analysis), focusing on its transition to virtual arrays as the primary backend for efficient HEP data processing. With the introduction of Awkward Array's Virtual Arrays feature, Coffea now offers lazy data loading capabilities that dramatically reduce memory consumption while maintaining the familiar, user-friendly analysis syntax that physicists expect.
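To make the idea of lazy loading concrete, here is a minimal, library-free sketch of a column that is read only when an analysis first touches it. It is not the Awkward Array or Coffea implementation: the `LazyColumn` class, the `FAKE_FILE` dictionary, and the branch names are all hypothetical and exist only for illustration.

```python
from typing import Callable, Dict, List


class LazyColumn:
    """Hypothetical stand-in for a virtual array: holds a loader, not data."""

    def __init__(self, name: str, loader: Callable[[], List[float]]):
        self.name = name
        self._loader = loader
        self._data = None  # nothing is read until first access

    @property
    def is_materialized(self) -> bool:
        return self._data is not None

    def materialize(self) -> List[float]:
        # Run the loader once and cache the result for later accesses.
        if self._data is None:
            self._data = self._loader()
        return self._data


# Pretend file contents (assumed values standing in for branches on disk).
FAKE_FILE: Dict[str, List[float]] = {
    "Muon_pt": [31.2, 8.4, 55.0],
    "Jet_pt": [102.5, 47.1, 33.3],
    "MET_pt": [12.0, 80.5, 41.7],
}

events = {
    name: LazyColumn(name, lambda name=name: list(FAKE_FILE[name]))
    for name in FAKE_FILE
}

# Only the column the analysis touches is loaded.
high_pt_muons = [pt for pt in events["Muon_pt"].materialize() if pt > 30.0]
loaded = sorted(name for name, col in events.items() if col.is_materialized)
print(high_pt_muons)  # [31.2, 55.0]
print(loaded)         # ['Muon_pt']
```

The memory saving in this toy comes from never reading `Jet_pt` or `MET_pt`. A real virtual array would additionally hide the `materialize()` step behind ordinary array operations.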

The tutorial will begin with an introduction to columnar analysis concepts, demonstrating how Coffea enables physicists to work with complex, nested particle physics data using familiar NumPy-like operations. Through interactive Jupyter examples, participants will learn to structure typical HEP analyses, from basic event selection to histogramming.
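As a rough illustration of the columnar idea, the sketch below stores a variable number of muons per event as one flat list plus offsets, then performs an event selection and fills a simple histogram. It uses plain Python rather than Awkward Array or Coffea, and the values, cut, and bin width are invented for the example. In practice the same steps would be expressed as vectorized array operations.

```python
from collections import Counter

# Jagged "muon pT" data for 4 events, stored columnar-style:
# one flat content list plus offsets marking event boundaries.
content = [45.0, 12.0, 33.0, 60.0, 25.0, 8.0]
offsets = [0, 2, 2, 3, 6]  # event i owns content[offsets[i]:offsets[i + 1]]

n_events = len(offsets) - 1

# Per-event muon multiplicity, computed from the offsets alone.
counts = [offsets[i + 1] - offsets[i] for i in range(n_events)]

# Event selection: keep events with at least two muons.
event_mask = [n >= 2 for n in counts]

# Leading-muon pT for selected events (max over each event's slice).
leading_pt = [
    max(content[offsets[i]:offsets[i + 1]])
    for i in range(n_events)
    if event_mask[i]
]

# Histogram with 20 GeV-wide bins, keyed by lower bin edge.
bin_width = 20
hist = Counter(int(pt // bin_width) * bin_width for pt in leading_pt)

print(counts)      # [2, 0, 1, 3]
print(leading_pt)  # [45.0, 60.0]
print(dict(hist))  # {40: 1, 60: 1}
```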

A key focus will be the migration path from Coffea 0.7 to the current virtual arrays implementation. Attendees will see that existing analysis code requires minimal modifications, often none at all, to benefit from lazy loading, which lowers memory use without sacrificing computational efficiency.

The session will cover advanced optimization techniques including explicit branch preloading for network-efficient data access, workflow tracing to identify required branches for efficient bulk loading, and new checkpointing features for robust, resumable workflows. Practical examples will demonstrate how virtual arrays work transparently behind familiar analysis patterns, allowing physicists to focus on physics rather than data management details.
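Two of the ideas above, tracing a workflow to find the branches it needs and checkpointing so that a run can resume, can be sketched without any HEP libraries. The code below is a conceptual illustration only and does not use Coffea's actual tracing or checkpointing interfaces. `TracingEvents`, `analysis`, `run`, the chunk names, and the JSON checkpoint format are all assumptions made for the example.

```python
import json
import os
import tempfile


class TracingEvents(dict):
    """Hypothetical event container that records which branches are read."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.accessed = set()

    def __getitem__(self, key):
        self.accessed.add(key)
        return super().__getitem__(key)


def analysis(events):
    # Toy selection touching two of the three available branches.
    return [
        met for pt, met in zip(events["Muon_pt"], events["MET_pt"]) if pt > 30.0
    ]


# 1) Trace once on a tiny sample to learn which branches are needed,
#    so that only those are bulk-loaded later.
sample = TracingEvents(Muon_pt=[31.2], Jet_pt=[102.5], MET_pt=[12.0])
analysis(sample)
needed = sorted(sample.accessed)
print(needed)  # ['MET_pt', 'Muon_pt']


# 2) Process chunks, recording each finished chunk so a rerun can resume.
def run(chunks, checkpoint_path):
    done = {}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)
    for name, events in chunks.items():
        if name in done:
            continue  # already processed in an earlier run
        done[name] = analysis(events)
        with open(checkpoint_path, "w") as f:
            json.dump(done, f)
    return done


chunks = {
    "chunk0": {"Muon_pt": [31.2, 8.4], "MET_pt": [12.0, 80.5]},
    "chunk1": {"Muon_pt": [55.0], "MET_pt": [41.7]},
}
checkpoint = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
first = run(chunks, checkpoint)
second = run(chunks, checkpoint)  # resumes: nothing is recomputed
print(first == second)  # True
```

Writing the checkpoint after every chunk trades some I/O for the guarantee that an interrupted job loses at most one chunk of work.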

Author

Iason Krommydas (Rice University (US))
