# Bi-weekly DIRAC Development meeting – 30/04/2020
At CERN: Nobody, of course!
On Vidyo: Federico, Andre, Alexandre, Andrii, Andrei, Christopher, Christophe, Daniela, Hideki, Janusz, Marko, Simon, Zoltan
Apologies:
Follow-up from previous meeting
- python3: PR submitted for linting directory-by-directory
DIRAC communities roundtable
GridPP:
DFC: permissions checked at each directory level. If the LFN starts with a double slash, the DFC goes into an infinite loop (v6r22pXX, most probably not fixed in later versions, but to be re-checked).
A new installation that got broken still needs to be investigated; it is not clear what the issue was.
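To illustrate the double-slash LFN problem noted above, here is a minimal Python sketch. It does not reproduce the DFC code, and the cause of the DFC loop is not confirmed in these minutes; it only shows one plausible way a parent-directory walk can fail to terminate, and a defensive normalization. The function names `normalize_lfn` and `directory_levels` are hypothetical.

```python
import posixpath


def normalize_lfn(lfn):
    """Collapse runs of slashes so that '//a//b' becomes '/a/b'."""
    parts = [part for part in lfn.split("/") if part]
    return "/" + "/".join(parts)


def directory_levels(lfn):
    """Return every parent directory of an LFN, deepest first, ending at '/'.

    Without the normalization step, a loop of the form
    `while path != "/"` never ends for an LFN such as '//vo/file',
    because posixpath.dirname('//vo') is '//' and
    posixpath.dirname('//') is again '//'.
    """
    path = normalize_lfn(lfn)
    levels = []
    while path != "/":
        path = posixpath.dirname(path)
        levels.append(path)
    return levels
```

A per-level permission check could then iterate over `directory_levels(lfn)`, which always ends at the root.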
CLIC:
- File Copy issue from castor to eos
- The bug in xrootd that requires a workaround in DIRAC should be fixed in xrootd 5
- https://github.com/DIRACGrid/DIRAC/blob/13678e875138599a43f8898b63a91970f5a8b7f4/Resources/Storage/GFAL2_XROOTStorage.py#L81-L99
- https://github.com/xrootd/xrootd/issues/992
- (Not just for this issue) it would be good to test xrootd 5 RC2 https://github.com/DIRACGrid/DIRACOS/issues/136
- (iLC)Dirac with DiracOS v1r11: can use emacs after sourcing bashrc
LHCb:
v7r0 has been in production since Monday, also with Pilot3. Pilot files are taken from the web portal: not an ideal solution, but it appears to work fine.
France Grilles:
NTR
EGI:
The connection to OSG is working fine. COVID-19 research is using it and running smoothly.
Belle2:
Migration to v6r22. The long Belle2 certification process is slowing things down.
Nica:
DIRAC v7r0p20 in production. Environment isolation issues -> suggested to move to v7r0p22 as it at least uses DIRACOS v1r11.
Some issues with EOS based WebServers -> suggested to open tickets and put LHCb crew in CC for next cases.
Current situation
DIRAC
- v6r22:
- v6r22p26 was buggy (issue with TS)
- v6r22p27 has been created; it fixes the above and introduces efficiency plots.
- v7r0:
- v7r0p22: only documentation added
- uses DIRACOSv1r11
- v7r1:
- v7r1 created but not yet announced; the announcement will be made today.
WebApp:
NTR
Pilot3:
NTR
DIRACOS:
v1r11 included in DIRAC v7r1 and v7r0p22+
VM:
NTR
Documentation
diracdoctools made python3 compatible.
Slowly moving the documentation of service and agent options into ConfigTemplate.cfg. Not touching FrameworkSystem right now.
OAuth2:
Main topic for v7r2. The extension will be merged in DIRAC.
tornado, M2Crypto and other externals
NTR
management
NTR
diraccfg
NTR
Release planning, tests and certification
Release planning:
- v6r22 series
- NTR
- v7r0 series
- NTR
- v7r1 series
- NTR
- v7r2 series
- see below
Certification process:
- Jenkins instance: re-enable tests for JSON encoding
- Use MySQL 8 for DIRAC certification
- ES version for DIRAC certification: still stick to 6
Weekly development(s) focus
v7r2 content:
Already there, or PRs ready:
- GSoC students' work:
- already merged:
- RSS changes (database level). Main PR: https://github.com/DIRACGrid/DIRAC/pull/4121
- Add ES monitoring support to service and agent. Main PR: https://github.com/DIRACGrid/DIRAC/pull/4120
- PRs:
- https://github.com/DIRACGrid/DIRAC/pull/4221 (performance tests with Locust and Taurus) can be merged
- https://github.com/DIRACGrid/DIRAC/pull/4209 (Add ES Monitoring support to RMS) is probably still valid. The PR should be closed and re-done (Chris?), keeping the user’s commits
- already merged:
- remove pyGSI (from DIRAC)
- we can leave it in DIRACOS
- add Histogram support (https://github.com/DIRACGrid/DIRAC/pull/4497)
- MySQL 8 full support
- ThreadPool - replaced by ThreadPoolExecutor --> flip the flag, make ThreadPoolExecutor the default
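For readers unfamiliar with the replacement mentioned in the last item, the following is a minimal sketch of the standard-library `concurrent.futures.ThreadPoolExecutor` pattern. It does not show DIRAC's own ThreadPool class or the flag referred to above; `process_request` and `run_all` are hypothetical names used only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_request(request_id):
    """Hypothetical unit of work; a real service would do I/O here."""
    return request_id * 2


def run_all(request_ids, max_workers=4):
    """Run process_request on every id in a thread pool, collecting results by id."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to the id it was submitted for.
        futures = {executor.submit(process_request, rid): rid for rid in request_ids}
        for future in as_completed(futures):
            # future.result() re-raises any exception raised in the worker.
            results[futures[future]] = future.result()
    return results
```

Leaving the `with` block waits for all submitted tasks, so no separate shutdown or join call is needed.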
Requiring work:
- merge OAuth2 in DIRAC
- include Web
- JSON encoding
- Remove Pilot2 code
- Part of WebApp Core code needs to be moved to DIRAC
- Chris: why not push for native HTTPS in DIRAC instead?
- Andrei: no problem with that, but we need something working right now
- Chris: I will work on making a tech preview
DIRAC: current PRs and tasks being worked on, or topics from Google forum
RucioFileCatalog: there is a pre-production version of the catalog, provided and tested by Belle2.
https://github.com/DIRACGrid/DIRAC/issues/4544: TimeLeft and allocation of jobs, mostly for HPCs, long and technical discussion.
https://github.com/DIRACGrid/DIRAC/issues/4574 : what we want from the plotting (discussion thread)
AOB
This might be the last BiLD meeting for Zoltan, who is leaving LHCb and CERN after more than 10 years in the team. Thank you, Zoltan, for your great work and commitment; we hope to see you again soon.
Next week: first hackathon for v7r2? Maybe a bit premature, we will see in the next days.
Next BiLD in 2 weeks.
LHCbDIRAC
After the release:
- Pilot3 files
- it might be better not to use the WebPortal at all, to avoid the snowball effect.
- S3 and EOS are the obvious candidates.
- probably should be done only in an operational way, no code changes needed.
- CPUtime estimation wrong?
- certainly it changed (see thread with Philippe)
- non-root files for user jobs: meaningless error, will be hidden.
- input sandboxes
- locality problem for user jobs
Operations:
- start enabling M2Crypto on some machines [Christophe will enable it on duplicated services]
Bkk:
- The new prodrunview table is updated every 10 minutes by an Oracle job. It replaces the materialized view.
- Database is cleaned now.