BiLD-Dev

Europe/Zurich
Federico Stagni (CERN)
Description

Bi-Weekly "Loyal" DIRAC developers meeting, followed by the LHCbDIRAC developers meeting.

Join Zoom Meeting
https://cern.zoom.us/j/91857549918?pwd=S2c1ZmFpYVNDOGk4YjMrZmhOeGROUT09

Meeting ID: 918 5754 9918
One tap mobile
+41315280988,,91857549918# Switzerland
+41432107042,,91857549918# Switzerland

Dial by your location
        +41 31 528 09 88 Switzerland
        +41 43 210 70 42 Switzerland
        +41 43 210 71 08 Switzerland
        +33 1 7037 2246 France
        +33 1 7037 9729 France
        +33 1 8699 5831 France
Meeting ID: 918 5754 9918
Find your local number: https://cern.zoom.us/u/aUn2ex94J

Join by SIP
91857549918@188.184.110.70
91857549918@188.184.89.188

Join by H.323
188.184.110.70
188.184.89.188
Meeting ID: 918 5754 9918
 

BiLD (Bi-weekly DIRAC Development meeting) – 19/11/2020

At CERN: Again, nobody :/
On Zoom: Federico, Alexandre, André, Andrei, Andrii, Cedric, Christophe, Christopher, Daniela, Hideki, Igor, Janusz, Marko, Simon, Ueda, Vladimir

Apologies:

Follow-up from previous meeting

  • Ran hackathon last week on v7r2-pre20: https://trello.com/b/GXUqWLnK/v7r2-pre20

    • added CompressJDLs option in JobDB
      • no issues
    • added EnableActivityMonitoring option to REA (using MonitoringSystem)
      • but we didn’t try it out
    • env variables in extra_bashrc:
      • removed M2CRYPTO flag (all OK)
      • added DIRAC_USE_JSON_ENCODE=Yes (NOT OK, had to be removed – not ready yet)
    • Several issues but nothing major
      • created several bulky PRs, some already merged
    • https functionalities not yet tried out
  • Last week's topic: DIRAC-Rucio integration

    • General consensus: using python bindings instead of REST looks like a better option
    • Belle2: main issue was mapping the Rucio scopes to the namespace
    • Discussion on how to configure Rucio (is the rucio.cfg file needed or not?)
      • NO real conclusion at that time.
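
To make the scope-mapping question raised by Belle2 more concrete, here is a minimal sketch. It assumes, purely hypothetically, that a Rucio scope is built from the first path components of an LFN. The names `lfn_to_did` and `depth` are invented for this note; this is not Belle2's actual convention, nor code from DIRAC or Rucio.

```python
def lfn_to_did(lfn, depth=2):
    """Split an LFN into a hypothetical (scope, name) pair.

    Assumption: the scope is the first `depth` path components joined
    with dots, and the name is the full LFN. A real mapping would be
    defined by the community, not by this rule.
    """
    parts = [p for p in lfn.split("/") if p]
    if len(parts) <= depth:
        raise ValueError("LFN too short to derive a scope: %r" % lfn)
    scope = ".".join(parts[:depth])
    return scope, lfn


# Example: a user file and a production file end up in different scopes.
user_did = lfn_to_did("/belle/user/alice/ntuple.root")
prod_did = lfn_to_did("/belle/MC/release-05/mdst_000001.root")
```

The point of the sketch is only that some deterministic rule has to exist in both directions; which rule to use was the open question.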

DIRAC communities roundtable

GridPP:

 Daniela

  • Now running v7r1p17 with v7r1p19 pilot (for pilot.cfg fix for cloud and singularity)
  • One of our users almost immediately realized his argument list was becoming corrupted. Simon summarized it here: https://github.com/DIRACGrid/Pilot/issues/120
    •  Federico: I will take care of it ASAP
  • CheckPlatform in the SiteDirector seems to be broken and we couldn’t figure out how to make it work: created https://github.com/DIRACGrid/DIRAC/issues/4835.
    •  Federico: this option tries to limit pilot submission to those CEs that advertise a matching platform. It basically relies on what is in the CS, which normally comes from BDII. We disabled it in LHCb.
    •  Vladimir: what is in BDII is usually crap, so it’s better not to use it
  • Firewall issues from Tokyo uni? Not sure. Will forward to the list.
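
As an illustration of the CheckPlatform discussion above, the following sketch filters CEs by their advertised platform. The dictionary layout, the `Platform` key, and the function name are assumptions made for this note and do not reproduce the SiteDirector code. It shows why stale or missing BDII-derived values matter: a CE with a wrong or absent platform is silently excluded.

```python
def ces_matching_platform(ces, wanted_platform):
    """Return the names of CEs whose advertised platform matches.

    `ces` is a hypothetical mapping of CE name -> description dict,
    standing in for what would be read from the CS.
    """
    return [
        name
        for name, info in sorted(ces.items())
        if info.get("Platform") == wanted_platform
    ]


# Hypothetical CS content: one correct entry, one stale, one missing.
example_ces = {
    "ce01.example.org": {"Platform": "x86_64-centos7"},
    "ce02.example.org": {"Platform": "x86_64-slc6"},  # stale value
    "ce03.example.org": {},  # nothing advertised
}
selected = ces_matching_platform(example_ces, "x86_64-centos7")
```

With data like this only one CE would receive pilots, which is consistent with the choice to disable the check when the published information is unreliable.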

CLIC:

 André

  • Still v6r22
  • Added ignorelist to the monitoring Agent in iLCDirac
    • Ready for moving to DIRAC
    • REA sometimes gets stuck, not sure if caused by the “ArchiveFiles” or “CheckMigration” (to tape) operations (or another), or just by itself
      •  Christophe: check the number of running threads. There may be a bug, but it is difficult to find. It might even come from the interaction with gfal2.
  • Bug in the Glue2 code I use in v6r22 (backport’ish): multiple queues per CE are not handled; maybe not an issue in v7r0

LHCb:

 Federico

  • Still on v7r0. Hackathon just yesterday, still showing a couple of issues in DIRAC:
  • Fully using M2Crypto (now also on CVMFS client – so also on most of the pilots).

EGI:

 Andrei

France Grilles:

 Andrei

  • NTR

Belle2:

 Hideki

  • v6r22
  • started gradual migration to DIRACOS
    • so far no problem

JINR

 Igor

  • NTR

Juno, BES3:

  • NTR

DIRAC releases

  • v7r0p40:
    • Nothing new, just docs
  • v7r1p18 + v7r1p19:
    • RSS fixes for VO column (not over! LHCbDIRAC hackathon showed more issues, draft PR open)
    • SingularityCE fix for pilot3 (not over! LHCbDIRAC hackathon showed more issues)
    • BDII2CSAgent: stop looking for SEs in BDII
      • plus other fixes for Glue2
  • v7r2-pre21:
    • dirac-install with diracos: remove TERMINFO, RRD_DEFAULT_FONT, GFAL and ARC paths from bashrc
    • Late import of MonitoringReport
    • JobWrapperTemplate kills the JobWrapper in case of Exception during Execution phase
    • Remove MultiProcessorSiteDirector (use standard SiteDirector)
    • Stop setting JDL requirements MaxCPUTime, SubmitPools, GridRequiredCEs, Origin, JobMode

DIRAC

PRs

  • nothing specifically discussed

Issues

  • v7r0:
    • Nothing in there
  • v7r1:
    • Several tasks still in there; some are documentation tasks assigned to Andrei and Andrii
  • v7r2:
    • mostly non-urgent
  • v7r3:
    • Not discussed

WebApp:

PRs

With the 2 following open PRs, there will be no DIRAC code needed for uploading files to the DIRAC Web portal, as everything will be managed by nginx + webdav plugin

Issues

Pilot3:

PRs

  • just one draft PR for binding a WN to a VO (for VMs and multi-VO environment)

Issues

DIRACOS:

  • xroot5? no news
  • ldap3 added in v1r16, which also includes tornado
    • seems fine

DIRACOS2:

  • still need to change dirac-install to update the link: will be done this week
  • conda-forge might drop SLC6
    • should be fine

Issues

VMDIRAC:

  • Maintainer? Probably Andrei, unless Simon wants to take care
  •  Igor: disableWatchdogChecks does not work on VMs?
    • dirac-install should be downloaded during the bootstrap process
    • in theory, we don’t see why it shouldn’t, unless there’s a problem with Pilot3
  • Many installations use VMDIRAC
  • Pilot3? not sure

Documentation:

  • NTR

OAuth2:

  • WIP

tornado/HTTPs

  • NTR

management

diraccfg

  • NTR

other externals, including Rucio

PRs

  • 1 PR submitted: https://github.com/DIRACGrid/DIRAC/pull/4811 by Cedric

  •  Janusz: a license statement should be added. I didn’t have time to look at Cedric’s code. If we use REST we don’t need rucio.cfg, while we do need it if we use the python APIs.

    •  Christopher: open an issue with Rucio to have different ways of passing the config
    •  Cedric: I will discuss that with the Rucio developer

Release planning, tests and certification

Weekly development(s) focus

  • Environment Isolation
    • Where do we stand?
    • Should check by the next BiLD; right now the situation has improved a lot.

AOB and topics from Google forum

Next hackathon in 1 week.
Next BiLD in 2 weeks.


LHCbDIRAC

    • 10:00 10:10
      Items from Previous BiLD-Dev 10m
    • 10:10 10:20
      DIRAC Communities roundtable 10m
    • 10:20 10:35
      Current situation 15m
      • DIRAC
      • WebApp
      • Pilot
      • DIRACOS
      • DIRACOS2
      • VMDIRAC
      • Documentation
      • OAuth2
      • tornado/HTTPs
      • other externals (including Rucio)
    • 10:35 10:50
      Release planning, tests and certification 15m
    • 10:50 11:10
      Weekly development(s) focus 20m

      DIRAC-Rucio integration

      Speakers: Cedric Serfon (Brookhaven National Laboratory (US)), Janusz Martyniak
    • 11:10 11:20
      DIRAC: current PRs and tasks being worked on 10m

      Ongoing PRs
      - v6r22 PRs
      - v7r0 PRs
      - v7r1 PRs
      - v7r2 PRs
      Ongoing tasks
      - ?
      Topics from the google forum
      - ?

    • 11:20 11:30
      AOB
      Convener: Federico Stagni (CERN)
    • 11:30 11:45
      LHCbDIRAC 15m