Zoom link in announcement email; please contact rootdev@cern.ch if you did not receive it!
Axel --> Lorenzo
New machine `ntplperf02`:
* AMD CPU with extra large cache
* 100 Gbit Ethernet card
Not able to reach 100 Gbit yet; working on it.
Discussion with Openlab, Google, and Microsoft: the suggestion was two fellows each, for cloud-based analysis. The companies came back with a lower budget. Now there are meetings at the SC conference with the companies specifically about this; there is a chance that this might actually fly.
There was an Architects Forum with a request for 6.30, which has been released in the meantime.
No other meetings to report. There was an RNTuple workshop with the experiments; we foresee having a "digestion" session during this week's PPP slot.
First release with the new CI, which is supposed to generate binaries for the new platforms. There was a bug; we added a commit and tagged it as `6.30/00a`. That is enough for us to release with the correct 6.30 tag, and the experiments can already use the tag anyway.
Also, 6.28.10 is quite important because it unblocks people on the latest macOS.
Q: How can we make the release procedure easier so that more people can do it?
A: Currently there's a lot of manual work involved. Moving to GitHub means we can add more automatic actions that can be run by everybody. Creating the release binaries is also way simpler.
We have self-hosted runners at CERN. Platform-dependent, a bit convoluted.
There's a repository under `root-project` on GH with the definitions for all Linux distros.
In the Dockerfile of each image, we install the system packages required for building ROOT, together with the Python packages required at runtime for tests. There are two `requirements.txt` files, one in `root` and one in `roottest`.
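As an illustration only, a minimal sketch of the Python-dependency part of that step, assuming hypothetical locations for the two requirements files (the actual Dockerfiles may handle this differently):

```python
# Sketch: install the runtime test dependencies listed in both
# requirements.txt files into the image's Python. File locations are
# assumptions, not the real repository layout.
import subprocess
import sys

REQUIREMENT_FILES = [
    "root/requirements.txt",      # assumed location inside the root checkout
    "roottest/requirements.txt",  # assumed location inside the roottest checkout
]

args = [sys.executable, "-m", "pip", "install"]
for req in REQUIREMENT_FILES:
    args += ["-r", req]
subprocess.check_call(args)
```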
Q: Do we do this for all platforms?
A: We only have x86 64-bit, no 32-bit, no ARM. We could also add these to our CI; then we would need to understand whether e.g. a `fedora39-arm` Dockerfile would be needed or not.
Images are built daily by a GH workflow, artifacts pushed to S3.
Creating a new directory for a new image in the repository doesn't mean that image will be generated; it must also be added to the workflow (see the sketch below).
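A minimal sketch of a consistency check for that pitfall, with hypothetical paths and workflow layout (the actual repository and workflow names may differ):

```python
# Hypothetical check: list image directories that have a Dockerfile but are
# not included in the daily build workflow's matrix. Paths and the YAML
# structure are assumptions, not the real repository layout.
from pathlib import Path

import yaml  # pip install pyyaml

repo = Path(".")
image_dirs = {p.name for p in repo.iterdir() if (p / "Dockerfile").is_file()}

workflow = yaml.safe_load((repo / ".github/workflows/build-images.yml").read_text())
built = set(workflow["jobs"]["build"]["strategy"]["matrix"]["image"])

for name in sorted(image_dirs - built):
    print(f"{name}: has a Dockerfile but is not built by the workflow")
```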
Images are uploaded to registry.cern.ch, the official CERN registry. We retain images for 12 days.
Q: Building images daily means we get new bugs from distro updates that might be unrelated to ROOT, e.g. the gcc update on Fedora in August, which took multiple weeks to fix.
A:
Axel: Users are affected by the same issues. Maybe we should try to change the way we address them.
Vassil: Maybe we should introduce a staging area.
Axel: What the nightly Docker image action does is update the distro's system packages. This kind of update should very rarely break existing packages, including ROOT. More often, a new distro breaks ROOT. For a new distro, we first need to create a Docker image for it (that's a PR). Then we create a PR against ROOT itself (and we make sure that PR passes tests), so that's already a staging area.
Vassil: I still think that distros sometimes update minor versions of compilers and we are not resilient enough to that. That should be enough to warrant the creation of a staging area for building images.
Q: Who gets informed about failures in nightly builds of the Docker images?
A: Currently "everyone who's subscribed"; we need to find a better way to inform the team.
Q: Are these images used for `rootbench` as well?
A: Not yet, but we can certainly do that.
Pipelines are now stored in the `root` repository rather than in the Jenkins web interface.
Any PR triggers the checks defined in the `on_pull_request` section of that branch's pipeline file. The nightly only looks at the `master` branch.
Build artifacts are pushed to the OpenStack S3. Each CMake invocation (per branch and platform) is hashed; this way, a given ROOT build artifact becomes the starting point for a subsequent incremental build if and only if the CMake invocation is the same (see the sketch below).
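A minimal sketch of such a keying scheme, assuming a hypothetical key layout and option set (the actual format used by the CI may differ):

```python
# Sketch: derive an S3 object key for an incremental-build artifact from the
# branch, platform and CMake invocation. The key layout is an assumption.
import hashlib

def artifact_key(branch: str, platform: str, cmake_options: list[str]) -> str:
    # Sort the options so that their order does not change the hash.
    invocation = " ".join(sorted(cmake_options))
    digest = hashlib.sha256(invocation.encode()).hexdigest()[:16]
    return f"{branch}/{platform}/{digest}/build.tar.gz"

# Only a build made with the exact same invocation is reused as the
# starting point of an incremental build.
print(artifact_key("master", "fedora38",
                   ["-DCMAKE_BUILD_TYPE=RelWithDebInfo", "-Dtesting=ON"]))
```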
Q: Should we make the S3 bucket more easily browsable through the web interface?
A: In principle this is not something user-facing. Until concrete use cases are presented we are probably good like this.
Q: I would expect the CI to run with the defaults on all platforms, and for those to work. Then we could have further override options, but those should be an extra.
Comment: We should test the `cmake` default configuration as CMake determines it, so we avoid testing our CI with configurations that our users won't have. Testing builtins should also be done. This should be driven as a separate discussion on its own.
Comment: Regarding builds on the master branch: if there is a delay between the commit provided in the latest state of the PR and the latest commit on master, then we have a problem.
The CI runs upload a comment on the PR. The same comment gets updated whenever a new CI run has finished for the same PR. In the comment, a link to the detailed view of the logs of all the runs can be found (see the "this check" wording).
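One way such a comment "upsert" can be implemented, sketched here against the GitHub REST API; the bot login, repository path, and message handling are assumptions, not necessarily what the actual workflow does:

```python
# Sketch: update an existing CI status comment on a PR, or create it if it
# does not exist yet. Bot login and repository path are assumptions.
import os

import requests

API = "https://api.github.com/repos/root-project/root"
HEADERS = {
    "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
    "Accept": "application/vnd.github+json",
}
BOT_LOGIN = "github-actions[bot]"  # assumed author of the status comment


def upsert_status_comment(pr_number: int, body: str) -> None:
    comments = requests.get(f"{API}/issues/{pr_number}/comments",
                            headers=HEADERS).json()
    existing = next((c for c in comments if c["user"]["login"] == BOT_LOGIN), None)
    if existing:
        # Refresh the previous status comment instead of posting a new one.
        requests.patch(f"{API}/issues/comments/{existing['id']}",
                       headers=HEADERS, json={"body": body})
    else:
        requests.post(f"{API}/issues/{pr_number}/comments",
                      headers=HEADERS, json={"body": body})
```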
Comment: Can we ask the PR CI run to push the artifacts for the current run somewhere? Currently the only artifacts that are available are from the latest build of master for that particular distro/config.
Comment: I believe it's better for new contributors to have a single place to look for information about failures, rather than multiple. The comment on the PR can be a bit too dispersive.
We also have builds that run irrespective of PRs: nightlies (only on master), on-push builds (on master). The builds on master also update artifacts for incremental builds of PRs.
Comment: PR builds should stop whenever the PR is merged. Currently that's not the case. The same thing happens if there are multiple pushes to the same PR.
Jonas R.: Probably we don't want to build all pushes to the same PR. This avoids extra clutter.
Comment: The nice part is that we manage to build up the CI incrementally as part of the repository; this solves some of the synchronization problems we had in the past. The current approach has the limitation that the CI on PRs against master still fails quite often, even in evidently wrong cases such as PRs that only touch documentation. We should aim for a minimal set of features that are solid and known to always succeed, then build from there.