DOMA / TPC Meeting

Europe/Zurich
  • Attending: Brian, Wei, Andy, Paul, Fabrizio, Al, Ian, Lucia, Dmitry, Petr, Oliver
  • XRootD Protocol updates (Wei):
    • X509 TPC works with most storage systems in the current release.
    • The exception is EOS. It is not yet clear how they plan to implement this (it may require them to deploy dedicated servers).
    • Found a bug in the VOMS plugin.
    • Fabrizio: The bug was revealed by a crash in the DPM testbeds. It seems to be a conflict with XrdHttp? Wei: The new VOMS plugin uses a new set of OpenSSL APIs; the crash goes away when XrdHttp is loaded or when the older VOMS plugin is used.
      • Production VOMS plugin is 0.3; new one (broken) is 0.6.
      • Andy and Fabrizio will look into this.
      • Brian: Can we file a bug report since we have a stack trace?
    • Katy: A new test machine has been rolled out on the cluster; things are working well so far. Would like to change the tests so that more sites test against it, to gain confidence before rolling out everywhere.
      • ACTION ITEM (Brian): Update Rucio configs for new gateway hostname.
    • Al:
      • Everything is ready for releasing XRootD-based delegation for TPC in dCache 5.2, which should be out in a "matter of weeks" (1 July?).
      • The smoke tests for XRootD (written in Python) are getting into shape. They will be out next week; Al will come back to this subsequently.
    • Fabrizio: There's a problem in the heatmap currently.  Certificate used by Rucio had an expired VOMS extension?
      • Brian: Way back when, OSG didn't require VOMS verification (extension was "verified" separately by downloading a list of groups directly from VOMS Admin).  Maybe that's happening here?
      • Paul: Still looking at this, but it's possible the proxy itself is valid and the VOMS extension is expired.  The proxy could then get a new VOMS extension (valid), resulting in a chain with one VOMS extension that is invalid and one that is valid.
      • Several theories as to why this is - but will need to debug via email.
    • Wei: Ale and Mario are putting together a stress test inside ATLAS.  More to report later. Tim: Due to how it integrates with AGIS / missing functionality in Rucio, this will be a dedicated stress test for ATLAS.
    • Petr: GFAL stopped working recently, but xrdcp directly appears to be fine.  When run with debugging, GFAL appears to attempt to delegate.  Wei has some ideas - GFAL is perhaps setting the env var too late?
      • Tim requests that we post the instructions for doing xrootd TPC with GFAL.
      • Petr: GFAL appears to work with dCache (built by hand with recent-ish trunk) but not with DPM.  Al: maybe this is because dCache is doing some sort of fallback?
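Wei's "env var set too late" idea can be illustrated with a minimal Python sketch. It does not model GFAL or XRootD: the `Client` class and the `EXAMPLE_DELEGATE` variable name are hypothetical, chosen only to show that a setting read once at start-up is unaffected by a later change to the environment.

```python
import os


class Client:
    """Hypothetical client that reads its delegation setting once, at construction."""

    def __init__(self):
        # EXAMPLE_DELEGATE is a made-up variable name, not a real GFAL/XRootD setting.
        self.delegate = os.environ.get("EXAMPLE_DELEGATE", "0") == "1"


def run(set_before_init):
    """Build a client, setting the variable either before or after construction."""
    os.environ.pop("EXAMPLE_DELEGATE", None)  # start from a clean environment
    if set_before_init:
        os.environ["EXAMPLE_DELEGATE"] = "1"
    client = Client()
    if not set_before_init:
        # Too late: the client has already captured the old value.
        os.environ["EXAMPLE_DELEGATE"] = "1"
    return client.delegate


if __name__ == "__main__":
    print("set before init:", run(True))
    print("set after init: ", run(False))
```

If something like this is happening, the debug output and the actual behaviour could disagree, which is why the ordering is worth checking first.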
  • HTTP updates:
    • Minor fixes to the smoke-tests.  Fix-up for RHEL7 compat with curl.  Brian needs to resolve merge conflicts for the scitokens PR now.
    • dCache SciTokens support appears to work fine with the (unmerged) PR.
    • ACTION ITEM: Get status update from EOS.
    • Caltech has joined the smoke tests but is not passing yet (macaroon-related; unclear what the issue is).
      • Purdue, Wisconsin, & UCSD are waiting in the wings.
    • Nebraska has HTTP transfers working in PhEDEx, but there aren't many sources in CMS for HTTP (few sites touch PhEDEx anymore).
    • Paul: DESY is working to roll out HTTP TPC support for production instances.  3 July planned for ATLAS instance.
      • ACTION ITEM (Brian): Follow up with DESY CMS contact to export HTTP in PhEDEx
    • ACTION ITEM (Brian): Get info from Mario on what Rucio is lacking to do test transfers.  For protocol-specific links (i.e., only test HTTP on a specified link, not for entire destination), this could be done by the next major release (October?).
    • Paul: KIT joined the test matrix.
    • Paul: TRIUMF DynaFed endpoint is working, but others are still failing tests.
      • ACTION ITEM (Paul): Ping mailing list?
      • Think that RAL should be working?
      • Will be a topic in the July pre-GDB!
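Paul's hypothesis from the heatmap discussion (a valid proxy chain carrying one expired and one valid VOMS extension) can be sketched as a toy model. This is only an illustration under stated assumptions: `ProxyCert`, `VomsExtension`, the VO name `examplevo`, and the timestamps are all hypothetical, and no real X.509 or VOMS library is used.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import List, Tuple


@dataclass
class VomsExtension:
    """Hypothetical stand-in for a VOMS attribute certificate."""
    vo: str
    not_after: datetime


@dataclass
class ProxyCert:
    """Hypothetical stand-in for one certificate in a proxy chain."""
    subject: str
    not_after: datetime
    voms: List[VomsExtension] = field(default_factory=list)


def chain_valid(chain: List[ProxyCert], now: datetime) -> bool:
    """True if every certificate in the chain is still within its lifetime."""
    return all(cert.not_after > now for cert in chain)


def voms_status(chain: List[ProxyCert], now: datetime) -> Tuple[List[str], List[str]]:
    """Return (valid, expired) VO names for the extensions found in the chain."""
    valid, expired = [], []
    for cert in chain:
        for ext in cert.voms:
            (valid if ext.not_after > now else expired).append(ext.vo)
    return valid, expired


# Arbitrary reference time, chosen only to make the example deterministic.
now = datetime(2019, 6, 1, 12, 0, tzinfo=timezone.utc)

chain = [
    # Original proxy: still valid, but its VOMS extension expired an hour ago.
    ProxyCert("proxy-1", now + timedelta(hours=12),
              [VomsExtension("examplevo", now - timedelta(hours=1))]),
    # Proxy derived from it, carrying a freshly issued VOMS extension.
    ProxyCert("proxy-2", now + timedelta(hours=6),
              [VomsExtension("examplevo", now + timedelta(hours=6))]),
]

if __name__ == "__main__":
    print("chain valid:", chain_valid(chain, now))
    print("VOMS (valid, expired):", voms_status(chain, now))
```

Whether a given service accepts such a chain would depend on how it treats the expired extension; that was not established in the meeting.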
  • Agenda:
    • 17:30 - 17:50: Xrootd Protocol Update (20m). Speaker: Wei Yang (SLAC National Accelerator Laboratory (US))
    • 17:50 - 18:10: HTTP Protocol Update (20m). Speaker: Brian Paul Bockelman (University of Nebraska Lincoln (US))
    • 18:10 - 18:30: Discussion (20m)