Publication of statistical models: hands-on workshop

Europe/Zurich
CERN (online only)

Sabine Kraml (LPSC Grenoble)
Description

The statistical models used to derive the results of experimental analyses are of incredible scientific value and are essential information for analysis preservation and reuse. In arXiv:2109.04981, we made the scientific case for systematically publishing the full statistical models, discussed the technical developments that make this practical, and illustrated with a variety of physics cases how detailed information on the statistical modelling can enhance the short- and long-term impact of experimental results.

This workshop is intended as the first in a series to discuss in more detail practical issues for publishing statistical models and likelihoods, and work towards concrete solutions.

In this context, note also the PHYSTAT workshop on systematics (Nov 1-3 + Nov 10), and in particular the talk by Kyle Cranmer on "A call to action: Honoring PHYSTAT's 20 year old agreement" at 6 pm CET on Nov 1st, which will in part set the stage for this workshop.

Overall, apart from the first two days, the workshop addresses a rather specialized audience, i.e. people who work on technical solutions for publishing and/or (re)using statistical models and likelihoods.

Slides and recordings of all sessions are available via the timetable.

Registration
Registration and expression of interests
Participants
  • Adil Jueid
  • Adinda De Wit
  • Ajay Kumar
  • Akram Khan
  • Alexander Held
  • Ali Dokhani
  • Andre Lessa
  • Andre Sznajder
  • Andrea Coccaro
  • Andrew Fowlie
  • Andrew Gilbert
  • Andy Buckley
  • Arnab Purohit
  • Arnau Morancho
  • Aurore Savoy Navarro
  • Benjamin Fuks
  • Boyang Zhang
  • Caspar Schmitt
  • Chang-Seong Moon
  • Clemens Lange
  • Clement Helsens
  • David F. Renteria-Estrada
  • David Richard Shope
  • Dimitri Bourilkov
  • Edmund Xiang Lin Ting
  • Emanuele Angelo Bagnaschi
  • Enzo Canonero
  • Florian Urs Bernlochner
  • Frederic Engelke
  • Gaël Alguero
  • Giordon Holtsberg Stark
  • Glen Cowan
  • Henrik Junkerkalefeld
  • Humberto Reyes-González
  • Ingo Schienbein
  • Isabel Dominguez
  • Itay Bloch
  • Jack Araz
  • Jaco ter Hoeve
  • Jacob Julian Kempster
  • Jamie Yellen
  • Javier Mauricio Duarte
  • Jieun Yoo
  • Jing-Ge Shiu
  • Joaquin Hoya
  • Jonas Eschle
  • Jonas Wittbrodt
  • Jonathan Butterworth
  • Jonathon Mark Langford
  • Juan José López
  • Juan Rojo
  • Judita Mamuzic
  • Juhi Dutta
  • Karri Folan Di Petrillo
  • KC Kong
  • Louie Dartmoor Corpe
  • Luca Silvestrini
  • Lukas Alexander Heinrich
  • Maeve Madigan
  • Maria Moreno Llacer
  • Mario Arndt
  • Mark Neubauer
  • Massimiliano Galli
  • Matteo Bonanomi
  • Matthew Feickert
  • Maximilian Horzela
  • Michael Eliachevitch
  • Mohammed Mahmoud Mohammed
  • Monika Mittal
  • nahuel ferreiro
  • Nahuel Ferreiro Iachellini
  • Nicholas Wardle
  • Nick Manganelli
  • Nick Smith
  • Nicolas Berger
  • Oliver Majersky
  • Oliver Schulz
  • Pietro Vischia
  • Priyanka Cheema
  • Raghav Kansal
  • Ramon Orlando Ruiz Olais
  • Ravindra Kumar Verma
  • Riccardo Torre
  • Sabine Kraml
  • Sam Kaveh
  • Saroj Pokharel
  • Sebastian Hoof
  • Sezen Sekmen
  • Shehu AbdusSalam
  • Shiva Bikram Thapa
  • Tamas Almos Vami
  • Thomas Kuhr
  • Tim Adye
  • Tim Herrmann
  • Timothée Pascal
  • Tomas Dado
  • Vasiliki Mitsou
  • Veronica Sanz Gonzalez
  • Vincent Alexander Croft
  • Wolfgang Waltenberger
  • Zhuolin Zhang
    • 15:00 16:30
      Hands on pyhf 1h 30m

      Planned outline through pyhf tutorial material:

      • Introduction to HistFactory
      • Introduction to Workspaces
      • Modifiers
      • Workspace Manipulations
      • Using HEPData
      • Introduction to HistFactory Models with pyhf
      Speaker: Matthew Feickert (Univ. Illinois at Urbana Champaign (US))
    • 16:30 16:50
      Break 20m
    • 16:50 18:20
      Hands on Combine 1h 30m
      • differences with respect to HistFactory
      • serialisation of Combine statistical models
      • usage/extension of pyhf JSON format?
      • ....
      Speaker: Andrew Gilbert (Northwestern University (US))
    • 15:00 16:30
      Summary talks from PHYSTAT workshop 1h 30m

      We join the PHYSTAT workshop to follow their summary talks and discussion: https://indico.cern.ch/event/1051224/timetable/

    • 16:30 17:00
      Break 30m
    • 17:00 18:30
      Discussion on simplified likelihoods

      Approaches, schemes, limitations; simplifying and pruning full statistical models (cf. the third bullet point of Section 5 in arXiv:2109.04981)

      Conveners: Andy Buckley (University of Glasgow (GB)), Nicholas Wardle (Imperial College (GB))
    • 15:00 16:30
      Free discussion and working session 1h 30m

      This is a free session for those who want to discuss something, with no fixed program or topic. Join the main Zoom room and move to a separate discussion room if needed.

    • 15:30 17:00
      Reinterpretation and likelihoods sessions of the LLP workshop 1h 30m

      see https://indico.cern.ch/event/1042226/timetable/

      to join:
      URL: https://cern.zoom.us/j/66746428033?pwd=ZGYvZWo1dlExT3ZnamlRbzdlcHdoZz09
      Meeting ID: 66746428033
      Passcode: 96028080

    • 16:30 17:00
      Break 30m
    • 17:00 18:00
      Joint session with LLP workshop

      This is a joint session with the parallel "Long-lived Particle Community Workshop", where publication of likelihoods will be discussed. Please note the alternative Indico page and Zoom details!

      https://cern.zoom.us/j/69432885993?pwd=VWFEMnJFVFVxaW5JMnY2Rk53bzEvZz09
      Passcode: 96028080

      https://indico.cern.ch/event/1042226/timetable/#b-440156-re-interpretations-an

      Convener: Louie Dartmoor Corpe (CERN)
    • 15:00 16:30
      Machine learning likelihoods (and statistical models)

      We intend to discuss the main ideas related to interpolating likelihoods and statistical models using (Deep) Neural Networks. The main topics and open questions/issues are:
      - Bayesian vs Frequentist statistical approaches and their relations to the neural network representation of the Likelihood (e.g. combination of likelihoods and double counting of constraint terms vs priors, likelihood vs statistical model)
      - Interpolation of full statistical models through NN vs other established approaches
      - Regression vs density estimation (supervised vs unsupervised Likelihood learning)
      - Practical implementations within experiments
      - Practical implementations outside experiments (fitting groups)
      - Examples

      Conveners: Andrea Coccaro (INFN Genova (IT)), Riccardo Torre (INFN e Universita Genova (IT))
    • 16:30 17:00
      Break 30m
    • 17:00 18:30
      Discussion on measurement-unfolding tools and combinations of LHC searches and measurements
      Convener: Andy Buckley (University of Glasgow (GB))
      • 17:00
        Intro 1m
        Speaker: Andy Buckley (University of Glasgow (GB))
      • 17:05
        RooUnfold (esp IBU) 1m
        Speaker: Vincent Alexander Croft (Tufts University (US))
      • 17:10
        TUnfold 1m
        Speaker: Stefan Schmitt (Deutsches Elektronen-Synchrotron (DE))
      • 17:15
        PyFBU 1m
        Speaker: Clement Helsens (CERN)
      • 17:20
        TRExFitter 1m
        Speaker: Michele Pinamonti (Universita degli Studi di Udine (IT))
      • 17:25
        Convino 1m
        Speaker: Jan Kieseler (CERN)
      • 17:30
        Input from measurements with public stat models 1m
      • 17:35
        Inputs from search, combination, and recasting 1m