6th Open Science Practitioners Forum: Analysis Workflows

Europe/Zurich
Zoom Meeting ID: 68448005373
Host: Merten Dahlkemper

 Discussion from the Chat

1. Training & Tutorials (ATLAS, RECAST, Snakemake, REANA)

 

 

Discussion on whether to include this topic in analysis software trainings is ongoing.

 

There used to be some RECAST-specific sessions, e.g.:

https://alexschuy.github.io/2020-08-27-usatlas-recast-tutorial/index.html

 

Clarification: the reference was to the week-long ATLAS software tutorials:

https://atlas-software.docs.cern.ch/analysis/

 

Additional training material and events also exist.

 

 

Request for Tutorials

 

 

Question:

Could you provide the link to Snakemake/REANA tutorials?

 

Response:

 

Snakemake tutorial:

https://snakemake.readthedocs.io/en/stable/tutorial/tutorial.html

 

REANA tutorial:

https://hsf-training.github.io/hsf-training-reana-webpage/

 

(Note: The REANA tutorial mostly uses Yadage; the Snakemake parts still need to be finalized.)

 


 

 

2. REANA Usage (LHCb Example)

 

 

REANA usage in LHCb is currently limited.

One example involved running an analysis on REANA with Snakemake as the underlying workflow engine, which made the setup relatively smooth:

 

https://indico.cern.ch/event/1380367/contributions/5880485/attachments/2831210/4946726/M_Sarpis_LHCb_Analysis_with_Snakemake.pdf

 


 

 

3. Snakemake with HTCondor

 

 

Snakemake works on lxplus via HTCondor using the following profile:

https://github.com/Snakemake-Profiles/htcondor

 

Executor plugin repositories:

 

  • https://github.com/jannisspeer/snakemake-executor-plugin-htcondor

  • https://github.com/htcondor/snakemake-executor-plugin-htcondor

 

 

 

Identified Issue

 

 

The HTCondor executor plugin does not allow specifying JobFlavour.

Jobs are submitted with the default 20-minute time limit, causing longer jobs to be aborted.

 

 

Possible Solution

 

 

Supporting the Condor classads needed for JobFlavour should be straightforward, either through a documentation update or a small implementation change.

 

According to the plugin README, custom job resources must be defined with a classad_ prefix, e.g.:

 

classad_JobFlavour
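
As a rough illustration of the idea above, the following Python sketch picks a CERN batch JobFlavour for an expected runtime and wraps it in a resources dictionary keyed with the classad_ prefix. The flavour names and limits follow the CERN Batch Service documentation. The helper names (flavour_for_runtime, htcondor_resources) are hypothetical, and whether the plugin accepts the value in exactly this form (for example, with or without extra quoting) has not been verified here.

```python
# CERN batch JobFlavour names with their maximum wall time in minutes.
# "espresso" is the 20-minute default mentioned in the minutes.
JOB_FLAVOURS = [
    ("espresso", 20),
    ("microcentury", 60),
    ("longlunch", 120),
    ("workday", 480),
    ("tomorrow", 1440),
    ("testmatch", 4320),
    ("nextweek", 10080),
]


def flavour_for_runtime(minutes):
    """Return the shortest flavour whose limit covers the requested runtime."""
    for name, limit in JOB_FLAVOURS:
        if minutes <= limit:
            return name
    raise ValueError(f"No JobFlavour allows a runtime of {minutes} minutes")


def htcondor_resources(minutes):
    """Build a resources mapping for a Snakemake rule (hypothetical helper).

    ASSUMPTION: the key uses the classad_ prefix described in the plugin
    README; the exact value format expected by the plugin is not verified.
    """
    return {"classad_JobFlavour": flavour_for_runtime(minutes)}


if __name__ == "__main__":
    # A job expected to run for about 90 minutes.
    print(htcondor_resources(90))
```

The resulting key/value pair would then be placed in the resources section of the rule or in the workflow profile, if the plugin passes it through as described.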

 

 

Usability Feedback

 

 

It can be confusing that there are both:

 

  • an HTCondor cluster plugin, and

  • a profile that uses the generic-cluster plugin under the hood.

 

 

Some participants expressed willingness to help improve the plugin.

 

Additionally, some changes appear to have been submitted upstream:

https://github.com/jannisspeer/snakemake-executor-plugin-htcondor/pull/16

 


 

 

4. SLURM, MPI & GPU Support

 

 

Question:

Is Snakemake more modern now? Can we run SLURM/MPI jobs?

 

Response:

 

Yes, there is a SLURM executor plugin:

https://snakemake.github.io/snakemake-plugin-catalog/plugins/executor/slurm.html

 

The SLURM plugin supports:

 

  • MPI jobs

  • GPU jobs

 

 

Extended SLURM feature support is on the roadmap.

 


 

 

5. Grid Executors (ATLAS, CMS, LHCb)

 

 

A current limitation of Snakemake is the lack of grid executor support.

 

Potential improvements:

 

  • ATLAS: PanDA executor plugin

  • CMS: CRAB integration

  • LHCb: Possibly DIRAC

 

 

Such integrations would significantly increase adoption.

 


 

 

6. Workflow Visualization

 

 

Question:

Are there tools to draw Snakemake workflows as diagrams?

 

Response:

Workflow visualization is an integral part of Snakemake.
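
For reference, a minimal sketch of how the built-in visualization is typically used: Snakemake can print the job graph in Graphviz DOT format with its --dag option (or the rule-level graph with --rulegraph), and Graphviz's dot tool renders it. The Python wrapper below is only an illustration; the function names are hypothetical, and it assumes that snakemake and dot are installed and that a Snakefile exists in the current directory.

```python
import subprocess


def dag_commands(output_svg, rulegraph=False):
    """Return the snakemake and dot command lines used to draw the graph."""
    flag = "--rulegraph" if rulegraph else "--dag"
    snakemake_cmd = ["snakemake", flag]
    dot_cmd = ["dot", "-Tsvg", "-o", output_svg]
    return snakemake_cmd, dot_cmd


def render_dag(output_svg="dag.svg", rulegraph=False):
    """Render the workflow graph of the Snakefile in the current directory."""
    snakemake_cmd, dot_cmd = dag_commands(output_svg, rulegraph)
    # Snakemake writes the graph in DOT format to standard output.
    dot_source = subprocess.run(
        snakemake_cmd, check=True, capture_output=True, text=True
    ).stdout
    # Graphviz reads the DOT text from standard input and writes the SVG file.
    subprocess.run(dot_cmd, input=dot_source, check=True, text=True)
```

Calling render_dag("dag.svg") is equivalent to running snakemake --dag and piping the result into dot -Tsvg in a shell.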

 


 

 

7. Cross-Community Similarities & Computing Strategy

 

 

There are strong similarities between:

 

  • Particle physics

  • Astronomy

  • Other scientific communities

 

 

Computational workflow needs are largely similar.

Sociological factors often play as large a role as technical ones.

 

There are also plans to unify approaches to scientific computing in overlapping areas (e.g., within the ESCAPE project), including tools such as:

 

  • Rucio

  • REANA

 

 

User interface design and well-structured examples were highlighted as important adoption factors.

 


 

 

8. Software Provisioning (EESSI)

 

 

There is a long-term plan to provide Snakemake via the European Environment for Scientific Software Installations (EESSI).

 

Development progress is ongoing but slow.
