# Bi-weekly DIRAC Development meeting – 14/05/2020
At CERN: Nobody, of course!
On Vidyo: Federico, Andrei, Andrii, André, Christopher, Christophe, Igor, Daniela, Hideki, Ueda, Janusz, Alexandre, Marko, Simon, Cedric
Apologies:
Follow-up from previous meeting
- PR for python3 linting directory-by-directory is merged. A txt file in /tests holds the list of linted directories
- already applied in one PR, works fine
- RucioCatalog: reported a bit premature, Belle2-based
DIRAC communities roundtable
GridPP:
- testing v7r1 (jumping from v6r22, skipping v7r0)
- installation issue: race condition in installation process (issue 4579)
CLIC:
- Started using the dirac puppet module to create new puppet manifests for new cc7 server to replace the existing infrastructure
- No news for copying issue, hoping for XRoot5
LHCb:
- Pilot3: looks fine in prod for 2 weeks with no changes, but we also want to host it on an s3-based web site
- M2Crypto issues: we enabled it on a couple of machines. Some issues spotted, partly corrected in a PR that went in DIRAC v7r0p23
- Running on HPCs: CINECA (all “standard” but with fat KNL nodes). SDumont (no CE, fat nodes, limited CPUTime)
France Grilles:
- Running with v7r0, quite stable
- Not yet Pilot3, on top of to-do list
- Planning to update to v7r1 soon
EGI:
- Might merge France Grilles into EGI
- The Check-in solution now appears to provide all the needed information
- Being tested by Andrii
Belle2:
- Migration to v6r22
- The dirac-distribution container does not work with the structure of BelleDIRAC, where the web and DIRAC extensions are merged
- should look into the code in DIRACGrid/management; a simple fix is probably enough.
Nica:
- Not updated to latest DIRAC v7r0pXX version yet
- Run F@H in VMDIRAC jobs
Current situation
DIRAC
- v6r22:
- v7r0:
- created 2 patches since last BiLD: v7r0p23 and v7r0p24
- Fixed slow submission from CLI
- Fixed some of M2Crypto issues
- Fixed calculation of CPU consumed for the case of multi-step jobs
- Optimizers: check LFN InputSandboxes separately from InputData
- v7r1:
- v7r1p1 and v7r1p2 created, just inheriting changes from v6r22 and v7r0
- v7r2:
WebApp:
NTR
Pilot3:
NTR
DIRACOS:
- v1r12 new patch
- xroot5 not there yet, requested. A bit tricky.
VM:
NTR
Documentation:
Minor changes. Some clarification is needed on using its tool for extensions (LHCb tried and hit an issue).
OAuth2:
Being tested in EGI framework.
tornado and other externals
NTR
management
- It could be used to deliver the globalDefaults.cfg to different locations (and dirac-install can be updated accordingly).
- Issues with dirac-distribution and BelleDIRAC releases.
diraccfg
PR created for replacing native DIRAC CFG with diraccfg (targeting integration).
Release planning, tests and certification
Not much done with respect to what was discussed at the last BiLD meeting (which is still valid).
Weekly development(s) focus
DIRAC: current PRs and tasks being worked on, or topics from Google forum
- v7r0:
- NTR
- v7r1:
- NTR
- v7r2:
- Several PRs open, the only one discussed was [4586]: [v7r2] Add __future__ import for Python 3 like imports, division, and printing
- added these lines to all the files:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
The one up for discussion is
from __future__ import division
that is somewhat dangerous. It has been pointed out that probably the only real danger may come from plotting and accounting. In the end, the conclusion is that we need to bite the bullet anyway.
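To illustrate why the division import was singled out, here is a minimal sketch of the behaviour change. The function names and numbers are hypothetical and are not taken from the DIRAC plotting or accounting code; they only show the kind of expression that changes meaning.

```python
from __future__ import division  # changes `/` on Python 2, no effect on Python 3


def mean_cpu_per_job(total_cpu_seconds, n_jobs):
    """Hypothetical helper: `/` is now true division, so two integers
    can give a fractional result instead of a truncated one."""
    return total_cpu_seconds / n_jobs


def bucket_index(timestamp, bucket_length):
    """Hypothetical helper: code that relies on an integer result
    (e.g. a time bucket number) must use `//` explicitly."""
    return timestamp // bucket_length


print(mean_cpu_per_job(7, 2))
print(bucket_index(7250, 3600))
```

Without the import, Python 2 would truncate `7 / 2` to `3`. With it, any code that silently depended on that truncation has to be switched to `//`, which is the kind of review plotting and accounting would likely need.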
Issues
AOB
Next BiLD in 2 weeks.
LHCbDIRAC
- Pilot3 files:
- s3 site ready, changes in puppet ready. Just need to fire.
- CPUtime estimation wrong: waiting for patch
- non-root files for user jobs: meaningless error, will be hidden (waiting for patch)
- input sandboxes locality problem for user jobs: running still with hotfix, waiting for patch
There have been some issues when creating the release, so probably it will be created this afternoon and deployed on Monday.
Started enabling M2Crypto on some machines: some issues, only partly addressed.
BKK: we’ll need to update the password of the production instance (the certification has been done already). Scheduling it for Monday, Federico will take care.