BiLD-Dev
Bi-Weekly "Loyal" DIRAC developers meeting. And, following, the LHCbDIRAC developers meeting.
Join Zoom Meeting
https://cern.zoom.us/j/91857549918?pwd=S2c1ZmFpYVNDOGk4YjMrZmhOeGROUT09
Meeting ID: 918 5754 9918
BiLD (Bi-weekly DIRAC Development meeting) – 19/11/2020
At CERN: Again, nobody :/
On Zoom: Federico, Alexandre, André, Andrei, Andrii, Cedric, Christophe, Christopher, Daniela, Hideki, Igor, Janusz, Marko, Simon, Ueda, Vladimir
Apologies:
Follow-up from previous meeting
- Ran hackathon last week on v7r2-pre20: https://trello.com/b/GXUqWLnK/v7r2-pre20
- added CompressJDLs option in JobDB
- no issues
- added EnableActivityMonitoring option to REA (using MonitoringSystem)
- but we didn’t try it out
- env variables in extra_bashrc:
- removed M2CRYPTO flag (all OK)
- added DIRAC_USE_JSON_ENCODE=Yes (NOT OK, had to be removed – not ready yet)
- Several issues but nothing major
- created several bulky PRs, some already merged
- https functionalities not yet tried out
- Last week topic: DIRAC-Rucio integration
- General consensus: using python bindings instead of REST looks like a better option
- Belle2: main issue was mapping the Rucio scopes to the namespace
- Discussion on how to configure Rucio (is the rucio.cfg file needed or not?)
- No real conclusion at that time.
DIRAC communities roundtable
GridPP:
Daniela
- Now running v7r1p17 with v7r1p19 pilot (for pilot.cfg fix for cloud and singularity)
- One of our users almost immediately realized his argument list was becoming corrupted. Simon summarized it here: https://github.com/DIRACGrid/Pilot/issues/120
- Federico: I will take care of it ASAP
- CheckPlatform in the SiteDirector seems to be broken and we couldn’t figure out how to make it work: created https://github.com/DIRACGrid/DIRAC/issues/4835.
- Federico: this would try to limit sending pilots to those CEs that show the matching platform. This basically relies on what is in the CS, which normally comes from BDII. We disabled this in LHCb.
- Vladimir: what is in BDII is usually crap, so it’s better not to use it
- Firewall issues from Tokyo uni? Not sure. Will forward to the list.
CLIC:
André
- Still v6r22
- Added ignorelist to the monitoring Agent in iLCDirac
- Ready for moving to DIRAC
- REA sometimes gets stuck, not sure if caused by the “ArchiveFiles” or “CheckMigration” (to tape) operations (or another), or just by itself
- Christophe: check the number of running threads. There may be a bug, but it is difficult to find. It might even come from the interaction with gfal2.
- Bug in the Glue2 code I use in v6r22 (backport’ish): it does not handle multiple queues per CE; maybe not an issue for v7r0
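A minimal sketch of the kind of thread check suggested above, assuming the stuck agent is a CPython process in which a small diagnostic function can be called. `dump_threads` is a hypothetical helper, not part of DIRAC; it relies only on the standard library (`threading.enumerate` and the CPython-specific `sys._current_frames`).

```python
import sys
import threading
import traceback


def dump_threads():
    """Return a list of (thread name, formatted stack) for every live thread."""
    # CPython-specific: maps thread identifiers to their current stack frame.
    frames = sys._current_frames()
    report = []
    for thread in threading.enumerate():
        frame = frames.get(thread.ident)
        stack = "".join(traceback.format_stack(frame)) if frame is not None else ""
        report.append((thread.name, stack))
    return report


if __name__ == "__main__":
    # Simulate a worker that is blocked, then inspect the process.
    stop = threading.Event()
    worker = threading.Thread(target=stop.wait, name="stuck-worker")
    worker.start()

    print("running threads:", threading.active_count())
    for name, stack in dump_threads():
        print("---", name)
        print(stack)

    # Release the simulated worker so the script exits cleanly.
    stop.set()
    worker.join()
```

If the thread count keeps growing, or the same stack shows up on every dump, that would point to where the agent is blocked (for example inside a gfal2 call), but this is only a diagnostic aid under the assumptions above.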
LHCb:
Federico
- Still on v7r0. Hackathon just yesterday, still showing a couple issues in DIRAC:
- RSS (“usual” VO thing, maybe fixed in https://github.com/DIRACGrid/DIRAC/pull/4834)
- Singularity issues, not fully investigated
- Fully using M2Crypto (now also on CVMFS client – so also on most of the pilots).
EGI:
Andrei
- Running v7r0, still pilot2
- Tried to switch on M2Crypto, problem with users using pyGSI client
- Christophe: this is surprising. The two can work together without issues
- Problem with host certificate: make sure to follow https://dirac.readthedocs.io/en/latest/AdministratorGuide/ServerInstallations/InstallingDiracServer.html#server-certificates
- 1 community with metadata directory catalog with wildcards – code will be added
- 1 community with large file system to be imported in the file catalog
- need a script (Daniela has one)
- Federico: add it to management
- all services will be migrated to France early next year
France Grilles:
Andrei
- NTR
Belle2:
Hideki
- v6r22
- started gradual migration to DIRACOS
- so far no problem
JINR:
Igor
- NTR
Juno, BES3:
- NTR
DIRAC releases
- v7r0p40:
- Nothing new, just docs
- v7r1p18 + v7r1p19:
- RSS fixes for VO column (not over! LHCbDIRAC hackathon showed more issues, draft PR open)
- SingularityCE fix for pilot3 (not over! LHCbDIRAC hackathon showed more issues)
- BDII2CSAgent: stop looking for SEs in BDII
- plus other fixes for Glue2
- v7r2-pre21:
- dirac-install with diracos: remove TERMINFO, RRD_DEFAULT_FONT, GFAL and ARC paths from bashrc
- Late import of MonitoringReport
- JobWrapperTemplate kills the JobWrapper in case of Exception during Execution phase
- Remove MultiProcessorSiteDirector (use standard SiteDirector)
- Stop setting JDL requirements MaxCPUTime, SubmitPools, GridRequiredCEs, Origin, JobMode
DIRAC
PRs
- nothing specifically discussed
Issues
- v7r0:
- Nothing in there
- v7r1:
- Several tasks still in there, some are documentation tasks for Andrei and Andrii
- v7r2:
- mostly non-urgent
- v7r3:
- Not discussed
WebApp:
PRs
With the 2 following open PRs, there will be no DIRAC code needed for uploading files to the DIRAC Web portal, as everything will be managed by nginx + webdav plugin
Issues
- Make web.cfg optional, move its content to dirac.cfg (https://github.com/DIRACGrid/WebAppDIRAC/issues/368): should be solved by the open PRs
Pilot3:
PRs
- just one draft PR for binding a WN to a VO (for VMs and multi-VO environment)
Issues
- pilot.cfg added everywhere: https://github.com/DIRACGrid/Pilot/issues/120
DIRACOS:
- xroot5? no news
- ldap3 added in v1r16, which also includes tornado
- seems fine
DIRACOS2:
- still need to change dirac-install to update the link: will be done this week
- conda-forge might drop SLC6
- should be fine
Issues
- André looked into it, asked for more doc
VMDIRAC:
- Maintainer? Probably Andrei, unless Simon wants to take care
- Igor: disableWatchdogChecks does not work on VMs?
- dirac-install should be downloaded during the bootstrap process
- in theory, we don’t see why it shouldn’t, unless there’s a problem with Pilot3
- Many installations use VMDIRAC
- Pilot3? not sure
Documentation:
- NTR
OAuth2:
- WIP
tornado/HTTPs
- NTR
management
- Andrei should finish the PRs for the deployment
- PR: Conda recipes for DIRACOS2
diraccfg
- NTR
other externals, including Rucio
PRs
- 1 PR submitted: https://github.com/DIRACGrid/DIRAC/pull/4811 by Cedric
- Janusz has coded https://github.com/martynia/DIRAC/pull/12 and the two should be compared
- Janusz: a license statement should be added. I didn’t have time to look at Cedric’s code. If we are using REST we don’t need rucio.cfg, while we do need it if using the Python APIs.
- Christopher: open an issue with Rucio to have different ways of passing the config
- Cedric: I will discuss that with the Rucio developer
Release planning, tests and certification
- Certification machine updated to be multi-VO (task: https://github.com/DIRACGrid/DIRAC/issues/4631)
- NTR
- Next hackathon November 26th
- Already called in https://indico.cern.ch/event/976175/
Weekly development(s) focus
- Environment Isolation
- Where do we stand?
- Should check by the next BiLD; right now the situation has improved a lot.
AOB and topics from Google forum
Next hackathon in 1 week.
Next BiLD in 2 weeks.
LHCbDIRAC
- Waiting for RSS fix on v7r1 before releasing v10r1.
- Christophe: I will circulate a document on the changes for the LHCbDIRAC certification machine: https://codimd.web.cern.ch/2es8HPUcSmS42jrTxhX3JQ?both
- new VO box? not yet, first we finish v10r1