2nd S2I2 HEP/CS Workshop

US/Eastern
Princeton University

Daniel S. Katz (University of Illinois), Douglas Thain (University of Notre Dame), Jim Pivarski (Princeton University), Mark Neubauer (Univ. Illinois at Urbana-Champaign (US)), Michael David Sokoloff (University of Cincinnati (US)), Oliver Gutsche (Fermi National Accelerator Lab. (US)), Peter Elmer (Princeton University (US)), Robert William Gardner Jr (University of Chicago (US)), Sergei Gleyzer (University of Florida (US))
Description

The worldwide particle physics community is currently planning upgrades to the Large Hadron Collider (LHC) at CERN in Geneva. The LHC today already uses a worldwide distributed computing grid to meet the needs of thousands of scientists to process and analyze some of the world's largest scientific datasets. The upgrades being planned will increase data volumes by more than two orders of magnitude and require significantly more complex data and analysis techniques.

This 2nd S2I2 HEP/CS workshop aims to bring together a diverse set of attendees from the high energy physics (HEP) and computer science (CS) communities to understand how the two communities could work together in the context of a future NSF Software Institute aimed at supporting particle physics research over the long term. We will build on the discussions that took place at the first S2I2 HEP/CS workshop, take a fresh look at planned HEP and computer science research, and brainstorm about engaging specific areas of effort, perspectives, synergies and expertise of mutual benefit to the HEP and CS communities, especially as they relate to a future NSF Software Institute for HEP.

Discussions and sessions include: Science Practices & Policies; Sociology and Community Issues; Machine Learning; Software Life Cycle / Software Engineering; Software/Data/Workflow Preservation & Reproducibility; Scalable Platforms; Data Management, Access, Distribution, Organization; Data Intensive Analysis Tools and Techniques; Visualization; Data Streaming; and Training, Education, Professional Development, Advancement.

The meeting rooms at Princeton are:

    • 13
      Parallel Session - Data Management, Access and Organisation / Data Streaming

      Jadwin Hall 475

    • 14
      Parallel Session - Machine Learning, Algorithms

      Jadwin Hall 111

    • 15
      Parallel Session - Software Life Cycle / Software Engineering

      McDonnell Hall 103

    • 10:30
      Coffee Break
    • 16
      Parallel Session - Data Management, Access and Organisation / Data Streaming

      Jadwin Hall 475

      • a) Discussion
    • 17
      Parallel Session - Machine Learning, Algorithms

      Jadwin Hall 111

      • a) Discussion
    • 18
      Parallel Session - Software Life Cycle / Software Engineering

      McDonnell Hall 103

    • 12:30
      Lunch
    • 19
      Parallel Session - Data Intensive Analysis Tools & Visualization

      Jadwin Hall 111

      • a) Introduction
        Speaker: Jim Pivarski (Princeton University)
      • b) Lightning Talk: Constructing a ROOT-less workflow with python and HDF5
        Speaker: Matthew Bellis (Siena College)
      • c) Lightning Talk: Machine learning pipelines with Spark ML
        Speaker: Dr Alexey Svyatkovskiy (Princeton University)
      • d) Lightning Talk: XENON1T, Open Source and Python
        Speaker: Christopher Tunnell (Enrico Fermi Institute, University of Chicago)
      • e) Lightning Talk: Volumetric image analysis and visualization problems in neuroimaging
        Speaker: Lawrence Frank (UCSD)
    • 20
      Parallel Session - Scalable Platforms

      Jadwin Hall A06

    • 21
      Parallel Session - Software/Data/Workflow Preservation & Reproducibility

      Jadwin Hall 475

      • a) Introduction
        Speakers: Carlos Maltzahn (University of California - Santa Cruz), Mike Hildreth (University of Notre Dame (US))
      • b) Lightning Talk: Non-determinism in applications at the exascale: impact on debugging and numerical reproducibility
        Speaker: Michela Taufer (University of Delaware)
      • c) Lightning Talk: Recast, Reana, and HepData: infrastructure for reproducibility and reinterpretation
        Speaker: Lukas Alexander Heinrich (New York University (US))
      • d) Lightning Talk: The Popper Framework
        Speaker: Carlos Maltzahn (University of California - Santa Cruz)
    • 15:00
      Coffee Break
    • 22
      Parallel Session - Data Intensive Analysis Tools, Visualization

      Jadwin Hall 111

      • a) Discussion
    • 23
      Parallel Session - Scalable Platforms

      Jadwin Hall A06

      • a) Discussion
    • 24
      Parallel Session - Software/Data/Workflow Preservation & Reproducibility

      Jadwin Hall 475

    • 25
      Training, Education, Professional Development, Advancement

      Lewis Library 138

    • 26
      Physics Analysis Training Model at the CMS Experiment

      Lewis Library 138

      Speaker: Sudhir Malik (University of Puerto Rico (PR))
    • 27
      Discussion - Training

      Lewis Library 138

    • 28
      Summary - Software Life Cycle / Software Engineering

      Lewis Library 138

      Speakers: Elizabeth Sexton-Kennedy (Fermi National Accelerator Lab. (US)), Jeffrey Carver (University of Alabama)
    • 29
      Summary - Software/Data/Workflow Preservation & Reproducibility

      Lewis Library 138

      Speakers: Carlos Maltzahn (University of California - Santa Cruz), Mike Hildreth (University of Notre Dame (US))
    • 30
      Summary - Machine Learning, Algorithms

      Lewis Library 138

      Speaker: Sergei Gleyzer (University of Florida (US))
    • 10:30
      Coffee Break
    • 31
      Summary - Data Intensive Analysis Tools, Visualization

      Lewis Library 138

      Speaker: Fernanda Psihas (Indiana University)
    • 32
      Summary - Scalable Platforms

      Lewis Library 138

      Speakers: Douglas Thain (University of Notre Dame), Robert William Gardner Jr (University of Chicago (US))
    • 33
      Summary - Data Management, Access and Organisation / Data Streaming

      Lewis Library 138

      Speakers: Oliver Gutsche (Fermi National Accelerator Lab. (US)), Tanu Malik (DePaul University)
    • 34
      Discussion - Next Steps

      Lewis Library 138

    • 35
      Closeout

      Lewis Library 138

    • 13:00
      Take-Away Lunch