GridPP Technical Meeting - RAL's experience with Rocky 8

Europe/London
Virtual Only


Alastair Dewhurst (Science and Technology Facilities Council STFC (GB)), Andrew McNab (University of Manchester), David Colling (Imperial College (GB))
Description

Weekly meeting slot for technical topics. We will try to focus on one topic per meeting. We will announce at the Tuesday Ops meeting whether this meeting is going ahead and, if so, the topic to be discussed.


Notes from the questions - Rocky 8 meeting, 28/10/22

Alex (CERN) asks what led to the decision to choose Rocky.
Tom - not much in it; at decision time Rocky was slightly ahead.
Mike brings up the build system as a factor (Rocky's was open source, Alma's wasn't).
Alex (CERN) - the Alma build system is moving to open source.

Rocky 9 is moving away from Koji, which is a shame.

Rob - using Alma 8/9 on infrastructure happily (the decision was "toss a coin").
Asks whether RAL has thought of moving to mainline kernel builds.
Tom - we already do: we move to 5.4 during the build (for improved memory management).
Rob - big on pushing the LT (long-term) kernel.

Sam asks about the Python changes to scripts.
Tom - the scripts were written many years ago and not kept up to date (e.g. print statements without brackets).
Python 2-to-3 conversion scripts (e.g. 2to3) might have fixed 90% of the problems.
Also, Python 3 is stricter, so there might be more problems.
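
As an illustration of the kind of breakage discussed here (this is not taken from the RAL scripts; all function names are hypothetical), a minimal sketch of a few Python 2 idioms and their Python 3 forms might look like this. The old Python 2 spelling is shown in the comments.

```python
# Python 3 forms of idioms that commonly break old Python 2 scripts.
# All names are hypothetical and exist only for illustration.

def report(hostname, status):
    # Python 2:  print "%s: %s" % (hostname, status)
    # In Python 3 print is a function, so the brackets are required.
    line = "%s: %s" % (hostname, status)
    print(line)
    return line


def half(n):
    # Python 2:  n / 2 truncates for integers.
    # Python 3:  n / 2 returns a float, so use // for integer division.
    return n // 2


def decode_output(raw):
    # Python 3 is stricter about bytes versus text: data read from a
    # subprocess or socket is bytes and must be decoded explicitly.
    return raw.decode("utf-8").strip()
```

An automated converter handles the print case; the integer division and bytes/text cases are the sort of thing that tends to need a manual check.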

The Python and dnf issues (from the slides) were the biggest issues, which is a good thing.

Alex (CERN) asks whether there is any gating/testing process for new Rocky releases.
Tom - there is testing (notes the containerised working environment). It is a bit manual at the moment.
For the host OS we use dnf version locking on all RPMs and Aquilon snapshotting. Updates are tested on a small portion of hosts, then pushed out.
Issues do sometimes slip through the net.
Tom also notes that they only use a local repo mirror, for extra control.
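
The notes do not say how the small test portion is chosen. Purely as a sketch of one possible approach, assuming hosts are identified by hostname, the hypothetical helpers below pick a stable subset by hashing each hostname into one of 100 buckets. This is not RAL's actual mechanism.

```python
import hashlib


def in_canary(hostname, percent=5):
    """Return True if hostname falls into the first `percent` of 100 buckets.

    Hashing makes the choice stable: the same host is always in (or out of)
    the test group for a given percentage.
    """
    digest = hashlib.sha256(hostname.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent


def split_hosts(hostnames, percent=5):
    """Split hostnames into (canary, rest) lists, preserving input order."""
    canary = [h for h in hostnames if in_canary(h, percent)]
    rest = [h for h in hostnames if not in_canary(h, percent)]
    return canary, rest
```

With this sketch, a new locked package set would go to the `canary` list first and to `rest` only once no problems have appeared.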


Dan T in chat: 1) What issues are stopping VOs from running directly on RHEL (clone) 8? 2) What's stopping us going straight to RHEL (clone) 9?

AF - skipping RHEL8 and running in a container. Suggestion for sites to skip to RHEL9.

Will this be discussed at the WLCG meeting?
AF - probably.

Some discussion of the "missed boat" of Rocky 9 for those having to move this year.

Then talk of the move from RHEL8 -> RHEL9.
For some this is generally easier, but RHEL9 drops a number of compatibility things (e.g. iptables).

Rob notes some surprising gotchas, e.g. XFS filesystem problems on VMs. Subtleties everywhere.

Alex notes the HEPiX session on Monday:
https://indico.cern.ch/event/1200682/timetable/#20221031.detailed

Sam asks about moving the storage estate.
Tom - this is still in discussion, but he has a feeling it is going to be Rocky 9.
Early days yet though.

The likely process will be a slow migration to a fresh cluster. But this is a guess.

Some discussion of the delicate process of upgrading the OS on Ceph storage.

Note that NorduGrid only recently started supporting EL8, and has nothing for EL9 yet.
(You can compile your own, but that's sub-optimal.)

Tom B notes that adding containers in-house is a non-trivial effort.

AF notes that middleware is starting to provide containers.
(Tom notes that Condor and NorduGrid have images.)

AF thinks that in future we should containerise more and more
(e.g. xrootd).

Definite discussions at RAL on more widespread use of containerisation.

Dan T asks whether RHEL9 containers running on 8 would have any issues.
Tom - only some basic testing so far, but all examples tested seem perfectly functional.

AF - 8 makes more use of network namespaces, which might have security issues (as we currently disable network namespaces by default).
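
The notes do not say how network namespaces are disabled. Assuming it is done through the Linux `user.max_net_namespaces` sysctl (an assumption, not something stated in the meeting), a minimal sketch for checking the current setting on a host could be the following. The function names are hypothetical.

```python
from pathlib import Path


def parse_limit(text):
    """Return True if the sysctl value is 0 (disabled), False if positive,
    or None if the text is not a valid integer."""
    try:
        return int(text.strip()) == 0
    except ValueError:
        return None


def net_namespaces_disabled(path="/proc/sys/user/max_net_namespaces"):
    """Read the per-user network namespace limit from procfs.

    The default path is an assumption about how the site applies the
    setting. Returns None when the file cannot be read, for example on a
    kernel that does not provide this sysctl.
    """
    try:
        return parse_limit(Path(path).read_text())
    except OSError:
        return None
```

A check like this could be run before and after an OS upgrade to confirm that the setting has carried over.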


Close.
 

    • 11:00 - 11:15
      RAL's Experience with Rocky 8 (15m)
      Speaker: Thomas Birkett
    • 11:15 - 11:45
      Discussion (30m)
    • 11:45 - 12:00
      AoB (15m)