CernVM-FS Coordination Meeting

Europe/Zurich
32/R-C06 (CERN)

Host: Jakob Blomer

Notes

  • Present: Jakob, Dave, Clemens, Alessandra, Lukas, Jose, Andrea, Catalin, Bob, Liam, Kenneth, Ryan

Introduction Slide (Dave)

Containers are more difficult than tarballs: large container images for small amounts of code

  - template transactions would help a lot

Still: a lot more publishing work, probably even more than the dailies

  - Smaller each time but many transactions, multiple transactions per day

Extremely low latency expected (development cycle)

Prevent a large transaction from blocking a small one
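The template transactions mentioned above let a new directory start as a server-side copy of an existing one, so only the changed files of a container image need to be re-uploaded. A hedged sketch of how this could look on the release-manager machine; the repository name and paths are illustrative, not taken from the meeting:

```sh
# Open a transaction in which /containers/myimage-v2 is pre-populated
# as a copy of /containers/myimage-v1 (template transaction, -T flag):
cvmfs_server transaction -T /containers/myimage-v1=/containers/myimage-v2 \
    unpacked.cern.ch

# Overlay only the files that actually changed between the two versions
# (here assumed to be staged under /tmp/new-layers):
cp -r /tmp/new-layers/* /cvmfs/unpacked.cern.ch/containers/myimage-v2/

# Publish the small delta instead of the full image:
cvmfs_server publish unpacked.cern.ch
```

This keeps each transaction small even when the underlying image is tens of GB, which is exactly the "smaller each time but many transactions" pattern described above.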


User code containers (Clemens)

  - Experiment framework: 35GB

Q: Weren't they 10GB?
A: They grew; the code itself is only ~2GB, but there are more and more dependencies

  - NanoAOD user code does not depend on CMSSW --> user containers only a few 100MB in size

  - Little adoption for using containers for analysis code

  - Analysis code is supposed to be visible only to CMS

Q: Wouldn't it be enough to have the source code protected and the executables open?
A: Maybe

Q: Situation similar in ATLAS?
A: Container images regularly used, integrated in grid middleware, special base containers, privacy is not so much of an issue for the executables in containers

  - Idea: treat cvmfs as a CDN; all calibration data on /cvmfs is synced from EOS. Would be nice to also sync gitlab --> cvmfs

User containers can vary in size, reduced ATLAS stack only 1GB

What's the expected scale? Usually 3 containers per analysis (selection, analysis, ML)
  - Bulk of changes in upper parts
  - Ask gitlab team
  - We could look at what's currently done with tarballs
  - Order of hundreds of concurrent analyses
    - We'd need to figure out how often new batches with code updates are sent to the grid
  - No train analysis systems foreseen
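The figures above can be turned into a rough back-of-envelope estimate of the publishing load. All inputs below are illustrative assumptions taken from the discussion ("order of hundreds", "3 containers per analysis", "a few 100MB"), not measurements:

```python
# Back-of-envelope estimate of cvmfs publishing load for user containers.
# Every number here is an assumption from the meeting discussion.

concurrent_analyses = 300     # "order of hundreds of concurrent analyses"
containers_per_analysis = 3   # selection, analysis, ML
updates_per_day = 2           # assumed: a couple of code pushes per day
container_size_mb = 300       # "user containers only a few 100MB in size"
changed_fraction = 0.1        # assumed: bulk of changes in the upper layers

publishes_per_day = concurrent_analyses * containers_per_analysis * updates_per_day
uploaded_gb_per_day = publishes_per_day * container_size_mb * changed_fraction / 1024

print(publishes_per_day)           # number of publish transactions per day
print(round(uploaded_gb_per_day))  # new data volume per day in GB
```

Under these assumptions the load is on the order of a couple thousand small transactions per day, which reinforces the point above: the concern is transaction frequency and turn-around, not raw data volume.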

Alessandra: users don't want to convert containers to tarballs for the grid, when they already work as containers at CERN

Clemens: user experience: would be good if everybody used containers (reproducibility); critical that turn-around is fast (10 minutes is too much)
  - On the grid: delay of minutes is in the noise
  - On local farm: delay can be down to 2 minutes
  - Container build time should dominate the propagation

Lukas: on CentOS 8, containers can be built unprivileged, so users can test locally

Alessandra: discourage users from publishing intermediate containers; test first, only publish when going large-scale

Can there be a scheduled downtime?
  - Current tarball upload server could publish to /cvmfs
  - Regular downtime probably unacceptable

Cvmfs VOMS-restricted repository probably not practical for CMS
  - We should rather look into code obfuscation or filtering out source code
  - Perhaps rather on the "nice to have" side, other points should be addressed first
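A minimal sketch of what "filtering out source code" before publishing could look like: copy a container tree while skipping source files. The extension list and function name are hypothetical; a real filter would also have to consider byte-compiled files, headers embedded in archives, and similar:

```python
# Sketch: copy a container tree to the publish area, skipping source
# files so only executables and data are exposed. Extension list is an
# illustrative assumption, not an agreed policy.
import shutil
from pathlib import Path

SOURCE_EXTENSIONS = {".py", ".cc", ".cpp", ".h", ".C"}  # assumed list

def publish_without_sources(src: Path, dst: Path) -> list[str]:
    """Copy src to dst, skipping source files; return the skipped paths."""
    skipped = []
    for path in src.rglob("*"):
        rel = path.relative_to(src)
        if path.is_dir():
            (dst / rel).mkdir(parents=True, exist_ok=True)
        elif path.suffix in SOURCE_EXTENSIONS:
            skipped.append(str(rel))
        else:
            (dst / rel).parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, dst / rel)
    return skipped
```

As noted above, this is on the "nice to have" side; a filter like this would run in the unpacking/publishing step, before the cvmfs transaction is closed.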

How to proceed from here
  - WLCG task force? (Should include gitlab synchronization? Different problem)
  - Could be handed over to container registry
  - CernVM devel team can convert the input into development items


Can we get better guesses at the experiment requirements and scale?
  - We could ask current users of unpacked.cern.ch
