9–13 Mar 2026
Europe/Berlin timezone

HEP Specific Workflows

11 Mar 2026, 15:00
1h

Speakers

Clemens Lange (Paul Scherrer Institute (CH))
Lukas Alexander Heinrich (Technische Universitat Munchen (DE))
Matthew Feickert (University of Wisconsin Madison (US))

Description

HEP-related links:
* https://indico.cern.ch/event/1643846/
* paramSet for systematic uncertainties
* wildcards for systematics soon
* avoid putting too many files in a single directory (this also applies to logs)
* LHCb:
  - https://lhcb.github.io/starterkit-lessons/first-analysis-steps/analysisflow.html
  - https://hsf-training.github.io/analysis-essentials/snakemake/README.html
* CMS: https://alefisico.github.io/snakemake-cms-tutorial/
* SLURM example: https://github.com/clelange/snakemake-psi-tier3-example, see also https://github.com/lukasheinrich/snakemake-mpcdf/
* HTCondor example: https://github.com/matthewfeickert/snakemake-lxplus-example
* Storage backend plugins: https://github.com/snakemake/snakemake-interface-storage-plugins
* XRootD plugin: https://github.com/snakemake/snakemake-storage-plugin-xrootd
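
The notes on systematic uncertainties and on directory sizes can be combined in a small sketch. It is plain Python, not Snakemake API: the sample and systematic names, the `results/` layout, and the helper `output_path` are all hypothetical. It expands every (sample, systematic, direction) combination into an output path and gives each systematic its own subdirectory, so that no single directory collects all files. The same layout could be used for log paths.

```python
from collections import Counter
from itertools import product
from pathlib import PurePosixPath

# Hypothetical samples, systematic uncertainties, and variation directions.
SAMPLES = ["ttbar", "wjets"]
SYSTEMATICS = ["jes", "jer", "pileup"]
DIRECTIONS = ["up", "down"]


def output_path(sample: str, syst: str, direction: str) -> PurePosixPath:
    """Return an output path with one subdirectory per sample and systematic."""
    return PurePosixPath("results") / sample / syst / f"hist_{direction}.root"


# One target per (sample, systematic, direction) combination.
targets = [
    output_path(sample, syst, direction)
    for sample, syst, direction in product(SAMPLES, SYSTEMATICS, DIRECTIONS)
]

# Count the files per directory to check that the load is spread out.
files_per_dir = Counter(str(target.parent) for target in targets)

if __name__ == "__main__":
    for target in targets:
        print(target)
```

In a Snakefile, a list like `targets` could serve as the input of a top-level rule; how that maps onto the wildcard support for systematics mentioned above is left open here.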

Other items:
* Create a workflow gallery
* checkpoint example
* Develop training with HEP-specific examples
* Allow shared state, running on different clusters

Storage backends: S3, Git LFS, ...
Logger interface plugin that posts a GitHub status on the commit (it does not create a new commit; the status only reflects which version of the workflow is running)
GitPython to use git as storage backend?
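
The GitHub status idea can be sketched with the GitHub REST endpoint for commit statuses, which attaches a state to an existing commit SHA without creating a new commit. The code below only builds the HTTP request and does not send it. The function name `build_status_request`, the `context` label, and the example owner, repository, SHA, and token are assumptions for illustration, and how such a call would be wired into a Snakemake logger plugin is not shown.

```python
import json
import urllib.request

# States accepted by the GitHub commit status endpoint.
VALID_STATES = {"error", "failure", "pending", "success"}


def build_status_request(owner, repo, sha, state, description, token):
    """Build (but do not send) a request that sets a status on an existing commit."""
    if state not in VALID_STATES:
        raise ValueError(f"unsupported state: {state!r}")
    url = f"https://api.github.com/repos/{owner}/{repo}/statuses/{sha}"
    payload = {
        "state": state,
        "description": description,
        "context": "snakemake/workflow",  # hypothetical label for this status
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
        method="POST",
    )


# Placeholder values; a real call would use the SHA of the running workflow version.
request = build_status_request(
    "example-org", "example-analysis", "0123abc", "pending", "workflow running", "TOKEN"
)
```

Passing `request` to `urllib.request.urlopen` would send it; a plugin could post `pending` when the workflow starts and `success` or `failure` when it ends.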

https://github.com/snakemake/snakemake-interface-storage-plugins

An XRootD storage plugin already exists.

Detach Snakemake execution so that screen/tmux is not needed
* make use of the job IDs stored in the persistence database
* then wake up periodically, e.g. via a cron job (go back to sleep if there are no new jobs to handle within N minutes)
* Snakemake issue: https://github.com/snakemake/snakemake/issues/4084
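
The wake-up and sleep behaviour can be sketched as a polling loop with an idle timeout. This is a minimal illustration, not how Snakemake implements or plans to implement it: `run_until_idle`, `poll_jobs`, and `handle_job` are hypothetical names, and `poll_jobs` stands in for a lookup of job IDs in the persistence database. The clock and sleep functions are injectable so that the loop can be exercised without waiting.

```python
import time
from typing import Callable, List


def run_until_idle(
    poll_jobs: Callable[[], List[str]],
    handle_job: Callable[[str], None],
    idle_timeout: float,
    poll_interval: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Handle jobs until none arrive for idle_timeout seconds; return the number handled."""
    handled = 0
    last_activity = clock()
    while True:
        jobs = poll_jobs()
        if jobs:
            for job_id in jobs:
                handle_job(job_id)
                handled += 1
            # Any handled job resets the idle timer.
            last_activity = clock()
        elif clock() - last_activity >= idle_timeout:
            # Nothing to do for long enough: exit and wait for the next wake-up.
            return handled
        sleep(poll_interval)
```

A cron job could start such a process at regular intervals; with N minutes as `idle_timeout` (in seconds), the process exits on its own once no new jobs have appeared for that long.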
