4–8 Nov 2019
Adelaide Convention Centre
Australia/Adelaide timezone

Vectorized, imperative, and declarative processing of Awkward Arrays

7 Nov 2019, 15:00
15m
Riverbank R2 (Adelaide Convention Centre)

Oral Track 5 – Software Development

Speaker

Jim Pivarski (Princeton University)

Description

Over the past two years, the uproot library has become widely adopted among particle physicists doing analysis in Python. Rather than presenting an event model, uproot gives the user an array for each particle attribute. When an event contains multiple particles, this array is jagged: an array of unequal-length subarrays. Data structures and operations for manipulating jagged arrays are provided by the awkward-array library, which also includes array types for nested structures, nullability, and heterogeneity.
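To make the idea of a jagged array concrete, the following is a minimal sketch in plain Python of one common way to store unequal-length subarrays: a single flat content buffer plus an offsets list marking where each event begins and ends. It is an illustration only and does not use the uproot or awkward-array APIs; the function names and the muon values are hypothetical.

```python
# Sketch of a jagged array: one flat content buffer plus offsets marking
# where each event's subarray starts and stops. All names are illustrative
# and are not part of uproot or awkward-array.

def to_jagged(events):
    """Flatten a list of unequal-length lists into (content, offsets)."""
    content = []
    offsets = [0]
    for particles in events:
        content.extend(particles)
        offsets.append(len(content))
    return content, offsets


def from_jagged(content, offsets):
    """Rebuild the per-event subarrays from (content, offsets)."""
    return [content[offsets[i]:offsets[i + 1]] for i in range(len(offsets) - 1)]


# Hypothetical muon transverse momenta for four events; the second event
# contains no muons, so its subarray is empty.
muon_pt = [[31.2, 18.5], [], [45.0], [22.1, 12.3, 9.8]]

content, offsets = to_jagged(muon_pt)

# Number of particles in each event, derived from adjacent offsets.
counts = [offsets[i + 1] - offsets[i] for i in range(len(offsets) - 1)]
```

With this layout, a per-particle attribute is one contiguous buffer regardless of how many particles each event holds, and an empty event costs only a repeated offset.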

The primary mode of awkward-array manipulation is vectorized in the sense of a Single Python Instruction on Multiple Data (“virtual machine SIMD”). Many implementations of these high-level instructions are also vectorized in the hardware sense. But whereas the implicit loops of vectorized instructions make some calculations easier, they make others harder, especially algorithms that iterate until a convergence condition is met.
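As a rough illustration of the vectorized style, the sketch below computes a per-event sum over a jagged array with a handful of NumPy calls and no explicit Python loop. It assumes the flat content-plus-offsets layout and uses only NumPy, not awkward-array itself; the variable names and values are hypothetical.

```python
import numpy as np

# Hypothetical jagged array of particle pT values for four events, stored as
# a flat content buffer plus offsets (the second event is empty).
content = np.array([31.0, 18.5, 45.0, 22.0, 12.5, 9.5])
offsets = np.array([0, 2, 2, 3, 6])

# Particles per event: one array-at-a-time instruction.
counts = np.diff(offsets)

# Per-event sum via prefix sums: the sum of content[start:stop] equals
# prefix[stop] - prefix[start]. This also handles empty events.
prefix = np.concatenate(([0.0], np.cumsum(content)))
sum_pt = prefix[offsets[1:]] - prefix[offsets[:-1]]

# Broadcast each event's sum back to its particles, then compute every
# particle's share of its event's total in one elementwise division.
sum_per_particle = np.repeat(sum_pt, counts)
fraction = content / sum_per_particle
```

Each line acts on whole arrays at once, which is what makes reductions and elementwise formulas concise. A calculation in which each particle needs a different, data-dependent number of steps does not map onto this pattern as directly.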

To support algorithms with explicit for loops, we have extended awkward-array to work with Numba, a popular Python JIT compiler. Vectorized and imperative programming styles may now be used interchangeably on the same awkward-array structures, with no loss of performance.
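The sketch below shows the kind of imperative, iterate-until-convergence loop that is awkward to express with array-at-a-time instructions: a Newton iteration whose step count differs for every particle. It is an assumption-laden illustration, not the awkward-array Numba extension itself: the function receives raw NumPy content and offsets arrays, the function name is hypothetical, inputs are assumed to be strictly positive, and Numba is treated as optional so the code also runs as plain Python.

```python
import numpy as np

try:
    from numba import njit  # optional: JIT-compiles the loops if Numba is installed
except ImportError:
    def njit(func):  # fallback so the sketch still runs as plain Python
        return func


@njit
def newton_sqrt_per_event(content, offsets, tol):
    """Square root of every particle value by Newton's method, plus the
    largest iteration count needed within each event.

    Assumes all values in content are strictly positive.
    """
    n_events = len(offsets) - 1
    roots = np.empty(len(content), dtype=np.float64)
    max_iters = np.zeros(n_events, dtype=np.int64)
    for i in range(n_events):                      # loop over events
        for j in range(offsets[i], offsets[i + 1]):  # loop over particles
            a = content[j]
            x = a if a > 1.0 else 1.0              # start at or above sqrt(a)
            n = 0
            while abs(x * x - a) > tol * a:        # per-particle convergence test
                x = 0.5 * (x + a / x)
                n += 1
            roots[j] = x
            if n > max_iters[i]:
                max_iters[i] = n
    return roots, max_iters


# Hypothetical jagged input: four events, the second one empty.
content = np.array([4.0, 9.0, 1.0, 16.0, 25.0, 2.0])
offsets = np.array([0, 2, 2, 3, 6])

roots, max_iters = newton_sqrt_per_event(content, offsets, 1e-12)
```

The nested loops visit the data row by row (event, then particle), in contrast to the column-at-a-time order of the vectorized style, while operating on the same underlying buffers.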

These two programming styles, column-first vectorized and row-first imperative, differ in their order of execution. We will also show progress on a declarative interface, which doesn’t specify the execution order, allowing the implementation to optimize it independently of the content of the physics analysis itself.

Consider for promotion: Yes

Primary authors

Jim Pivarski (Princeton University), David Lange (Princeton University (US)), Peter Elmer (Princeton University (US))
