Awkward Array is a stable and widely used Python library for working with nested, variable-length, and irregular data: the kind of data that traditional NumPy arrays can't easily handle. Originally developed for high-energy physics, it has grown into a reliable tool for many fields beyond HEP.
Today, Awkward Array offers strong integration with libraries like NumPy, Numba, JAX, and GPU...
High-energy physics (HEP) analyses frequently manage massive datasets that surpass available computing resources, requiring specialized techniques for efficient data handling. [Awkward Array][1], a widely adopted Python library in the HEP community, effectively manages complex, irregularly structured ("ragged") data by mapping flat arrays into nested structures that intuitively represent...
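For readers unfamiliar with the library, here is a minimal sketch (the values are made up) of the ragged structures Awkward Array represents and the columnar operations it supports:

```python
import awkward as ak

# A ragged array: each "event" holds a variable number of values,
# something a rectangular NumPy array cannot express without padding.
pt = ak.Array([[21.3, 44.1], [], [12.9, 63.4, 7.5]])

# Columnar selection over all events at once, no Python loop.
selected = pt[pt > 20.0]       # [[21.3, 44.1], [], [63.4]]

# Per-event reductions respect the nesting.
n_selected = ak.num(selected)  # [2, 0, 1]
```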
As we pursue new physics at the LHC, the challenge of efficiently analyzing our rapidly mounting data volumes will continue to grow. This talk will describe the development and benchmarking of a realistic columnar-based end-user analysis workflow (for skimming Run 2 + Run 3 scale data with the Coffea framework) in order to characterize the current capabilities and understand bottlenecks as we...
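To make "columnar" concrete, the hedged sketch below expresses a skim as array operations rather than an event loop; it uses plain Awkward Array operations instead of the full Coffea machinery, and the branch names and cut values are illustrative only:

```python
import awkward as ak
import numpy as np

# Illustrative ragged event data (branch names and cuts are made up).
events = ak.Array({
    "Jet_pt":  [[55.0, 32.0], [18.0], [120.0, 44.0, 29.0]],
    "Jet_eta": [[0.4, -1.2],  [2.9],  [-0.1, 1.7, -2.3]],
})

# Object-level selection: central, high-pT jets.
good_jets = (events["Jet_pt"] > 30.0) & (np.abs(events["Jet_eta"]) < 2.4)

# Event-level selection (the skim): keep events with at least two good jets.
skimmed = events[ak.sum(good_jets, axis=1) >= 2]
```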
The rootfilespec package is designed to efficiently parse ROOT file binary data into Python data structures. It does not drive I/O and expects materialized byte buffers as input. It also does not return any types beyond Python dataclasses of primitive types (and NumPy arrays thereof). The goal of the project is to provide a stable and feature-complete read/write backend for packages such as uproot.
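rootfilespec's own API is not reproduced here; as a generic illustration of the pattern it describes (materialized bytes in, dataclasses of primitives and NumPy arrays out), the sketch below parses the start of a ROOT file header, which begins with the magic bytes `root` followed by big-endian integers:

```python
from dataclasses import dataclass
import struct
import numpy as np

@dataclass
class FileHeader:
    magic: bytes   # b"root" for a valid ROOT file
    version: int   # fVersion, big-endian int32
    begin: int     # fBEGIN, byte offset of the first data record

def parse_header(buffer: bytes) -> FileHeader:
    # The caller supplies materialized bytes; no I/O happens here.
    version, begin = struct.unpack(">ii", buffer[4:12])
    return FileHeader(magic=buffer[:4], version=version, begin=begin)

def read_int64s(buffer: bytes, offset: int, count: int) -> np.ndarray:
    # NumPy can reinterpret a slice of the buffer without copying.
    return np.frombuffer(buffer, dtype=">i8", count=count, offset=offset)
```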
RNTuple is a new columnar data storage format with a variety of improvements over TTree. The first stable version of the specification became available in ROOT 6.34, at the beginning of the year. Thus, we have entered the transition period in which our software migrates from TTrees to RNTuples. The Uproot Python library has stayed at the forefront of this transition, and already has fairly...
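As a hedged usage sketch (the file and RNTuple names are placeholders, and the exact method surface may vary between Uproot releases), reading an RNTuple is intended to look much like reading a TTree:

```python
import uproot

with uproot.open("events.root") as f:
    ntuple = f["Events"]                   # an RNTuple rather than a TTree
    arrays = ntuple.arrays(["pt", "eta"])  # Awkward Arrays, one per field
```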
Binned likelihoods (and optimizations thereof) in HEP offer various parallelization opportunities. This talk discusses those opportunities and how they can be implemented using the JAX package. Finally, the evermore package is presented as a showcase that already enables these optimizations with JAX.
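As an illustration of one such opportunity (independent of evermore's actual API), the sketch below vmaps and jits a toy binned Poisson negative log-likelihood so that many parameter points are evaluated in parallel; the single signal-strength model and the numbers are assumptions made for the example:

```python
import jax
import jax.numpy as jnp

# Toy binned model: observed counts plus fixed signal/background templates.
observed   = jnp.array([12.0, 25.0, 31.0, 18.0])
signal     = jnp.array([ 2.0,  5.0,  6.0,  3.0])
background = jnp.array([10.0, 20.0, 25.0, 15.0])

def nll(mu):
    # Poisson negative log-likelihood for a signal strength mu
    # (constant log(n!) terms dropped).
    expected = mu * signal + background
    return jnp.sum(expected - observed * jnp.log(expected))

# Parallelization opportunities: evaluate many mu values at once (vmap),
# compile the scan (jit), and get exact gradients for free (grad).
mus = jnp.linspace(0.0, 3.0, 301)
scan = jax.jit(jax.vmap(nll))(mus)
grad_at_best = jax.grad(nll)(mus[jnp.argmin(scan)])
```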
While advancements in software development practices across particle physics and the adoption of Linux container technology have made a substantial impact on the ease of replicability and reuse of analysis software stacks, the underlying software environments are still primarily bespoke builds that lack a full manifest to ensure reproducibility across time. The [HEP Packaging...
This talk covers histogram serialization development. We'll take a look at the new serialization specification being developed in UHI, look at how libraries can be developed to support serialization (such as boost-histogram), and work through some examples.
This is intended to be an introduction to serialization so that it can be a hackathon/sprint target later.
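Since the UHI specification is still being developed, the sketch below does not follow it; it only shows the kind of information a histogram serialization has to capture, pulled from a boost-histogram object into a JSON-ready dict (the field names are illustrative, not the UHI schema):

```python
import json
import numpy as np
import boost_histogram as bh

h = bh.Histogram(bh.axis.Regular(5, 0.0, 1.0), storage=bh.storage.Weight())
h.fill(np.random.default_rng(0).uniform(size=1000))

# Illustrative field names only; the real schema is defined in UHI.
payload = {
    "axes": [{"type": "regular", "edges": h.axes[0].edges.tolist()}],
    "values": h.values().tolist(),
    "variances": h.variances().tolist(),
}
print(json.dumps(payload))
```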
In the past year, development in Julia has led to the ability to statically compile small binaries (relative to the full runtime and LLVM).
In this presentation we briefly go over its basic principles and challenges, and demonstrate a proof-of-concept binding to FHist.jl.
Finally, we discuss some potential future uses as well as the ongoing development in larger...
The development of scientific data analyses is a resource-intensive process that often yields results with untapped potential for reuse and reinterpretation. In many cases, a developed analysis can be used to measure more than it was designed for, by changing its input data or parametrization. Building on the RECAST approach, which enables the reinterpretation of a physics analysis in the...
Statistical procedures at the end stages of analysis, such as hypothesis testing, likelihood scans, and pull plots, are currently implemented across multiple Python packages, yet lack interoperability despite performing similar functions once the log-likelihood is constructed. We present a contribution to HEPStats, part of the Scikit-HEP ecosystem, to provide a common interface for these final stages...
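To make the idea concrete, here is a hypothetical sketch (not the proposed HEPStats interface) of how final-stage procedures could share a single entry point once a negative log-likelihood is available:

```python
from typing import Protocol
import numpy as np

class CostFunction(Protocol):
    # Anything that maps parameter values to a negative log-likelihood.
    def __call__(self, params: np.ndarray) -> float: ...

def nll_scan(nll: CostFunction, grid: np.ndarray,
             index: int, start: np.ndarray) -> np.ndarray:
    """Hypothetical likelihood scan: evaluate the NLL along one parameter,
    holding the others at their starting values (no re-minimization)."""
    values = []
    for point in grid:
        params = start.copy()
        params[index] = point
        values.append(nll(params))
    return np.asarray(values)
```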
Statistical tooling in the scientific Python ecosystem continues to advance, while at the same time ROOT has recently adopted the HEP Statistics Serialization Standard (HS3) as the way of serializing RooWorkspaces for any probability model that has been built. There is a gap between packages such as jax and scipy.stats and what HS3 provides. This is where pyhs3 comes in: a modern...
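pyhs3's own API is not shown here; as a hedged illustration of the gap being bridged, the sketch below turns a tiny HS3-style fragment (field names simplified, not the actual HS3 schema) into a scipy.stats distribution:

```python
import json
from scipy import stats

# Simplified, illustrative fragment; the real HS3 schema is richer.
model_json = json.loads("""
{"distributions": [
    {"name": "signal_mass", "type": "gaussian", "mean": 125.0, "sigma": 2.5}
]}
""")

def build(spec):
    # Map a declarative description onto a scipy.stats object.
    if spec["type"] == "gaussian":
        return stats.norm(loc=spec["mean"], scale=spec["sigma"])
    raise NotImplementedError(spec["type"])

pdf = build(model_json["distributions"][0])
print(pdf.logpdf(126.0))
```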
Current statistical inference tools in high-energy physics typically focus on binned analyses and often rely on asymptotic approximations. However, present and future neutrinoless double beta decay experiments, such as the Large Enriched Germanium Experiment for Neutrinoless ββ Decay (LEGEND), operate in a quasi-background-free regime, where the expected number of...
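A minimal counting-experiment sketch (the numbers are made up, and the actual LEGEND analysis is unbinned and far more detailed) shows why asymptotics are questionable at such low counts, by comparing the asymptotic discovery p-value with one from background-only toys:

```python
import numpy as np
from scipy import stats

b = 0.5      # expected background counts (assumption for the example)
n_obs = 3    # observed counts

def q0(n, b):
    # Discovery test statistic for a Poisson counting model:
    # likelihood ratio of best-fit signal (clipped at zero) vs. no signal.
    s_hat = max(n - b, 0.0)
    if s_hat == 0.0:
        return 0.0
    return 2.0 * (n * np.log((s_hat + b) / b) - s_hat)

q_obs = q0(n_obs, b)
p_asymptotic = 0.5 * stats.chi2.sf(q_obs, df=1)   # half-chi2 approximation
toys = np.random.default_rng(1).poisson(b, size=200_000)
p_toys = np.mean([q0(n, b) >= q_obs for n in toys])
```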
Is it possible for software from individual collaborations to be packaged and maintained on conda-forge? There are many caveats involved, ranging from non-technical aspects such as licensing and usage to technical aspects such as cross-compilation, the large number of dependencies, and the configuration and parallel releases that may make this challenging. The collaborations I am thinking about...