PyHEP.dev 2024 - "Python in HEP" Developer's Workshop

Name: PyHEP.dev 2024 - "Python in HEP" Developer's Workshop
Start: 2024-08-26T07:30:00+02:00
End: 2024-08-30T20:00:00+02:00
Location: Aachen, Germany

26–30 Aug 2024

Europe/Brussels timezone

Contact

pyhepdev2024-organisation@cern.ch

Case for awkward array - efficient I/O for highly hierarchical data structure

Not scheduled

20m

Erholungs-Gesellschaft Reihstraße 13, 52062 Aachen

Jerry 🦑 Ling (Harvard University (US))

Drawing on our experience as developers, we present a case for adopting the awkward array abstraction in data I/O libraries for the highly hierarchical data sets commonly seen in HEP.

We will first discuss technical, performance-related lessons learned from implementing TTree/RNTuple reading; these are universal principles for I/O in all columnar data formats.

Specifically, we discuss the importance of minimizing allocations and of lazy materialization, given the sparse access patterns of HEP analysis.
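As a rough illustration of lazy materialization under a sparse access pattern, the sketch below decodes a column only when it is first requested and caches the result. It is not taken from any TTree/RNTuple reader: the class `LazyColumns`, the helper `make_reader`, and the column names are all hypothetical, and the "readers" merely stand in for real decompression and decoding.

```python
class LazyColumns:
    """Hypothetical lazy table: a column is decoded only on first access."""

    def __init__(self, readers):
        # readers maps a column name to a zero-argument function that
        # decodes that column from storage (a stand-in for real file I/O).
        self._readers = readers
        self._cache = {}
        self.loaded = []  # records which columns were actually decoded

    def __getitem__(self, name):
        if name not in self._cache:
            # Decode once, then reuse the same buffer on later accesses
            # so repeated lookups do not allocate again.
            self._cache[name] = self._readers[name]()
            self.loaded.append(name)
        return self._cache[name]


def make_reader(values):
    # Hypothetical stand-in for decompressing and decoding one column.
    return lambda: list(values)


events = LazyColumns({
    "jet_pt": make_reader([50.1, 32.0, 75.5]),
    "jet_eta": make_reader([0.1, -1.2, 2.0]),
    "muon_pt": make_reader([12.3, 8.8]),
})

# A typical analysis touches only a few of the available columns.
hard_jets = [pt for pt in events["jet_pt"] if pt > 40.0]
```

Here only `jet_pt` is ever decoded; `jet_eta` and `muon_pt` cost nothing because the analysis never asks for them.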

We then move to high-level design challenges and show how "owning" the entire data representation through awkward array enables efficient, lazy data I/O that cannot easily be achieved otherwise.

Specifically, we will use the example of accessing a "subset of complex data structures" (e.g. only jets.pt when event.jets isa Vector{LorentzVector}) to demonstrate the weakness of the conventional approach and how awkward array provides a systematic way to do exact I/O instead of overtouching.
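The jets.pt example can be sketched in a few lines, assuming a columnar layout in which a list of jets per event is stored as one shared offsets buffer plus one flat buffer per field. This is a simplified illustration and not the awkward array API or any reader's actual implementation; `offsets`, `field_buffers`, `read_field`, and `jets_field` are hypothetical names, and the in-memory buffers stand in for data on disk.

```python
from array import array

# Hypothetical columnar layout for event.jets: the jets of event i occupy
# positions offsets[i]:offsets[i + 1] in every per-field buffer.
offsets = array("q", [0, 2, 2, 5])  # 3 events with 2, 0, and 3 jets

# One flat buffer per field of the jet four-vector (stand-in for disk data).
field_buffers = {
    "pt": array("d", [50.0, 30.0, 80.0, 45.0, 20.0]),
    "eta": array("d", [0.1, -1.0, 2.1, 0.5, -0.3]),
    "phi": array("d", [1.0, 2.0, -3.0, 0.0, 0.7]),
    "mass": array("d", [5.0, 4.0, 9.0, 6.0, 3.0]),
}

reads = []  # records which field buffers were touched


def read_field(name):
    # Stand-in for the I/O call that fetches exactly one field's buffer.
    reads.append(name)
    return field_buffers[name]


def jets_field(name):
    # Rebuild the per-event nesting for a single field from the shared
    # offsets, without touching the buffers of any other field.
    content = read_field(name)
    return [
        list(content[offsets[i]:offsets[i + 1]])
        for i in range(len(offsets) - 1)
    ]


jets_pt = jets_field("pt")
```

Because the nesting lives in the offsets rather than in per-jet objects, asking for `pt` reads one buffer; a row-wise reader that first builds every four-vector would have to touch all four.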

Jerry 🦑 Ling (Harvard University (US)), Pere Mato (CERN)

