WLCG DOMA BDT Meeting

Europe/Zurich
Brian Paul Bockelman (University of Wisconsin Madison (US)), Maria Arsuaga Rios (CERN), Petr Vokac (Czech Technical University in Prague (CZ))
Description

Topic: WLCG DOMA BDT Meeting (twiki)

    • 4:30 PM 4:45 PM
      FTS + Gfal updates 15m
      Speaker: Mihai Patrascoiu (CERN)

      I. Recent DOMA-BDT meetings have discussed "XRootD + tokens" a lot. Clarification needed:

      • Is this about the XRootD WebDAV interface?
      • Is this via HTTPS or the root:// protocol?
      • If via the root:// protocol, which community is behind it? (95% of WLCG transfers go over HTTPS)

      II. Transition to tokens and XRootD:

      • Will data on storage endpoints need to be accessed via Gfal2 + tokens?
      • Gfal2 currently does not pass tokens to XRootD

      III. A previous DOMA-BDT meeting mentioned incorporating Gfal2 into the token compliance tests:

      • Which protocols are targeted?


      Answers

      I. The community is any PC running analysis that needs direct data access. This is usually done via XRootD.

      II. For HTTP, Gfal2 + tokens works. For root:// access, the underlying client can do the token discovery.
      III. This is about SAM/ETF tests. Point II applies here as well.

       

      Gfal2 will not improve root:// + token support for the moment.

      • Petr: we will have to restart this discussion once we are closer to providing CLI support for end users
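      Answer II relies on the client discovering a token by itself. A minimal sketch of such a lookup is given below, assuming a simplified reading of the WLCG Bearer Token Discovery convention (BEARER_TOKEN, then BEARER_TOKEN_FILE, then a bt_u<uid> file in XDG_RUNTIME_DIR or /tmp). The function name and its environ, uid and tmp_dir parameters are hypothetical and exist only to make the sketch testable; this is not the code used by Gfal2 or by the XRootD client.

```python
import os
from pathlib import Path


def discover_bearer_token(environ=None, uid=None, tmp_dir="/tmp"):
    """Return the first bearer token found, or None if nothing is found."""
    env = os.environ if environ is None else environ

    # 1. Token passed directly in the environment.
    token = env.get("BEARER_TOKEN", "").strip()
    if token:
        return token

    if uid is None:
        uid = os.getuid()  # Unix only

    # 2. File named explicitly by the environment.
    candidates = []
    if env.get("BEARER_TOKEN_FILE"):
        candidates.append(Path(env["BEARER_TOKEN_FILE"]))

    # 3./4. Per-user file in XDG_RUNTIME_DIR if set, otherwise in tmp_dir.
    if env.get("XDG_RUNTIME_DIR"):
        candidates.append(Path(env["XDG_RUNTIME_DIR"]) / f"bt_u{uid}")
    else:
        candidates.append(Path(tmp_dir) / f"bt_u{uid}")

    # Simplification (assumption): unreadable or empty files are skipped.
    for path in candidates:
        try:
            token = path.read_text().strip()
        except OSError:
            continue
        if token:
            return token
    return None
```

      Called without arguments, discover_bearer_token() reads the real environment of the calling process; passing an explicit dictionary keeps the lookup free of global state.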
    • 4:45 PM 5:00 PM
      Packet marking 15m
      Speakers: Marian Babik (CERN), Shawn Mc Kee (University of Michigan (US))

      LHCOPN-LHCONE meeting #49 - CERN, Geneva CH

      https://indico.cern.ch/event/1146558/ (24 Oct 2022, 13:00 → 25 Oct 2022, 17:15, 31/3-004 - IT Amphitheatre CERN)

      WLCG Data Challenges LHCONE/LHCOPN side meeting

      https://indico.cern.ch/event/1212782/ (Wednesday 26 Oct 2022, 13:00 → 15:00, 513/1-024 CERN)

      ... discuss timelines, milestones and possible "mini-challenges" we may want to conduct. We imagine there are many areas of work in monitoring, tools, applications, storage and networking that may want to have their own milestones and challenges to properly prepare for the next data challenge ...

       

      News:

      • We have a version of dCache (pool node) that can emit Fireflies (UDP Syslog flow label information) that will be tested this week at AGLT2 (thanks Tigran)
        • special dCache build from the developers
        • requires updates on pool nodes; not yet ready for an official release
      • We are continuing to work on an SC22 demo showing both flow labels and packet marking, as well as the new netlink information

      CMS is enabling tokens for US CMS sites

      • question: would it make sense to deploy packet marking functionality at the same time?
      • still in a testing phase / not yet ready for pre-production
      • start with the same sites that were involved in packet marking during DC21
        • Caltech, Nebraska, MWT2, AGLT2
    • 5:00 PM 5:15 PM
      Transfers with tokens 15m
      Speaker: Francesco Giacomini (INFN CNAF)

      Token support for transfers with Rucio & FTS

      Compliance tests

    • 5:15 PM 5:30 PM
      Tape REST access 15m
      Speaker: Mihai Patrascoiu (CERN)
    • 5:30 PM 5:40 PM
      AOB 10m

      HTTP-TPC multistream

      • Summarize status (support in FTS/gfal2 and in individual storage implementations, as passive and active party, for HTTP-TPC push and pull)
        • FTS/gfal2 - the configured number of streams is passed in the X-Number-Of-Streams header of the HTTP COPY request
        • dCache
        • DPM
        • EOS - pull transfers fail with multistream enabled, push ignores(?) multistream
        • Echo
        • StoRM
        • XRootD - pull transfers fail with multistream enabled, push ignores(?) multistream
      • Discussion triggered by GGUS:157985
        • Should we invest time to fix / implement this, or is single stream acceptable to DOMA?
        • What are the benefits? Could such functionality reduce operational effort?
        • Overloading storage with multiple connections (e.g. dCache movers)?
      • Other parameters ignored by FTS/gfal2 in the HTTP-TPC implementation
        • TCP buffer size
        • enforce IPv4 vs. IPv6
        • disable proxy delegation
      • The configuration interface in FTS is the same for all protocols
        • it is not clear which features are supported generally and which only by a fraction of SE implementations
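      The way the stream count travels with the transfer request can be sketched as follows. This is an illustration only, assuming the usual HTTP-TPC convention that the COPY request goes to the active endpoint, with a Source header for pull mode or a Destination header for push mode, and that headers prefixed with TransferHeader are forwarded to the passive endpoint. The function name, its parameters and the example.org URLs are hypothetical; it is not how FTS/gfal2 builds the request internally.

```python
def build_tpc_copy_request(active_url, passive_url, mode="pull",
                           streams=None, passive_token=None):
    """Return (method, url, headers) describing an HTTP-TPC COPY request."""
    if mode not in ("pull", "push"):
        raise ValueError("mode must be 'pull' or 'push'")
    if streams is not None and streams < 1:
        raise ValueError("streams must be a positive integer")

    # Pull: the destination is active and fetches from Source.
    # Push: the source is active and sends to Destination.
    headers = {"Source" if mode == "pull" else "Destination": passive_url}

    # Multistream request, as described in the minutes for FTS/gfal2.
    if streams is not None:
        headers["X-Number-Of-Streams"] = str(streams)

    # Credential that the active endpoint should present to the passive one.
    if passive_token:
        headers["TransferHeaderAuthorization"] = f"Bearer {passive_token}"

    return "COPY", active_url, headers


method, url, headers = build_tpc_copy_request(
    "https://dst.example.org/data/file",   # hypothetical active endpoint
    "https://src.example.org/data/file",   # hypothetical passive endpoint
    mode="pull",
    streams=4,
)
```

      Whether the active endpoint honours, ignores or fails on the X-Number-Of-Streams header is exactly the per-implementation question listed above.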

      Non-standard disknode ports in default configuration (HTTPS)

      • dCache - disk nodes use port range 20000-25000 for all protocols by default
      • DPM - disk nodes use port 443 for HTTPS, 1095 for xroot and a random port for gsiftp by default
      • Echo - doors use port 1094 for HTTPS
      • EOSATLAS - disk nodes use port 8443 for HTTPS
      • StoRM - head node / WebDAV doors use port 8443 for HTTPS
      • XRootD - uses port 1094 for HTTPS
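      As a rough illustration of why these defaults matter for site firewalls, the list above can be written as a lookup table and checked against a set of allowed outgoing port ranges. The port numbers are taken from the list; the table layout and the function name are assumptions made for this sketch, and real deployments may override the defaults.

```python
# Default HTTPS ports from the list above, as (first_port, last_port).
DEFAULT_HTTPS_PORTS = {
    "dCache": (20000, 25000),  # disk nodes, range shared by all protocols
    "DPM": (443, 443),
    "Echo": (1094, 1094),
    "EOSATLAS": (8443, 8443),
    "StoRM": (8443, 8443),
    "XRootD": (1094, 1094),
}


def blocked_implementations(allowed_ranges):
    """List implementations whose default HTTPS ports are not fully allowed.

    allowed_ranges is an iterable of (first_port, last_port) tuples that a
    (hypothetical) firewall lets through.
    """
    allowed = list(allowed_ranges)
    blocked = []
    for name, (first, last) in DEFAULT_HTTPS_PORTS.items():
        covered = any(lo <= first and last <= hi for lo, hi in allowed)
        if not covered:
            blocked.append(name)
    return blocked


# A firewall that only allows the standard HTTPS port.
only_443 = blocked_implementations([(443, 443)])
```

      With only port 443 open, every implementation in the table except DPM ends up in the blocked list.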

      dCache per-pool transfer limit

      • LHCb users observed problems accessing files on dCache (GGUS:153653)
        • hadd normally opens all files at the start, so the per-pool transfer limit can be reached quite easily
        • this can be avoided with the maxopenedfiles CLI parameter
      • it should not be so easy to "overload" dCache with the number of opened files
        • a limit that is "too low" was also observed by ATLAS at some dCache sites
          • jobs using "directio" can keep (multiple) files open for a long time
          • FZK limits the number of movers because of memory usage with the xroot protocol
            • dCache can even crash when it exhausts dcache.java.memory.direct memory
              • this can be triggered by just "one misbehaving" client
            • extensively discussed with dCache developers
        • I would not be surprised to see similar issues once users start to make extensive use of e.g. RDataFrame
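      The hadd workaround mentioned above can be sketched as a small command builder. It assumes that the maxopenedfiles parameter is given through hadd's -n option; the function name, the file names and the example.org storage URLs are hypothetical, and a suitable limit depends on the per-pool settings of the site.

```python
def build_hadd_command(target, sources, max_open_files):
    """Build a hadd command line that caps the number of files open at once."""
    if max_open_files < 0:
        raise ValueError("max_open_files must not be negative")
    if not sources:
        raise ValueError("at least one source file is required")
    # -n limits how many input files hadd keeps open simultaneously.
    return ["hadd", "-n", str(max_open_files), target, *sources]


inputs = [
    f"root://dcache.example.org//pnfs/example.org/data/part{i}.root"
    for i in range(3)
]
cmd = build_hadd_command("merged.root", inputs, max_open_files=2)
```

      The resulting list can be handed to subprocess.run(cmd, check=True) on a machine where ROOT is installed.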