DOMA / TPC Meeting
Topic: WLCG DOMA TPC Meeting
Join Zoom Meeting
https://cern.zoom.us/j/99836057922?pwd=ZFhWN3NpYi9oZmwvM3pIRE9zdzFnZz09
Meeting ID: 998 3605 7922
Passcode: 733660
One tap mobile
+41315280988,,99836057922# Switzerland
+41432107042,,99836057922# Switzerland
Dial by your location
+41 31 528 09 88 Switzerland
+41 43 210 70 42 Switzerland
+41 43 210 71 08 Switzerland
+33 1 7037 2246 France
+33 1 7037 9729 France
+33 1 8699 5831 France
Find your local number: https://cern.zoom.us/u/aeB4ArMgmT
SRM - tape T0 - T1 transfers
Paul (with Andrea and Mihai) presented three possible stages for changing the SRM authorization mechanism so that SRM can be used with HTTP:
- X509 + macaroons: will require some reorganisation of the code between gfal and FTS.
- X509 + JWT: at first sight needs no development, but after further discussion it may still need some.
- JWT only: requires some major changes.
CTA should work because it is based on xrootd and we have HTTP working on it. If we go in the direction of SRM+HTTP, do we also need to take castor into account? Agreement that castor shouldn't be touched and should keep using gsiftp.
There is no plan to proactively remove gsiftp until we are on more solid ground with other protocols. The plan is to move the infrastructure to something else but to keep gsiftp as a backup.
The same applies to SRM: in this way we can prolong the life of SRM, which is a well-known protocol that all T1s know how to deal with. We can think about removing SRM when we have a better QoS experience.
Rucio will need some adjustment too, as it currently assumes gsiftp.
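The "move to other protocols, keep gsiftp as backup" approach can be illustrated with a minimal sketch. This is not how FTS, gfal or Rucio select protocols; the function name `pick_protocol` and the preference order are hypothetical and only show the idea of a fallback ordering.

```python
from typing import Iterable, Optional, Sequence

# Hypothetical preference order: newer protocols first,
# gsiftp kept last as the backup option.
PREFERENCE: Sequence[str] = ("https", "root", "gsiftp")


def pick_protocol(
    source: Iterable[str],
    destination: Iterable[str],
    preference: Sequence[str] = PREFERENCE,
) -> Optional[str]:
    """Return the most preferred protocol supported by both endpoints."""
    common = set(source) & set(destination)
    for protocol in preference:
        if protocol in common:
            return protocol
    # No protocol in common: the transfer cannot be scheduled.
    return None
```

With this ordering, a pair of endpoints that both offer `https` never uses gsiftp, while a pair where one side only offers gsiftp still has a working path.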
Action: Mihai will create a few slides for the next meeting with a plan to enable stages 1 and 2 in FTS and gfal. Andrea volunteered to help with 2, so that we have something that can be tested for both setups by the end of the year.
Christophe (LHCb) and Michael (CTA) are on board with the plan.
Experiments production
Petr did some stress testing involving different storages and protocols. The following problems were highlighted:
- dcache abruptly interrupts transfers when using xrootd. This happens mostly between FZK and US sites, so it might be due to a TCP timeout. There was some discussion about this, but it will continue offline.
- Abruptly interrupted transfer errors also appear in production transfers for http-tpc and should be looked at.
- infn-t1 (but not only) has much worse HTTP-TPC transfer rates than gsiftp.
- Smoke tests failed when both the storage and the curl client try to use TLS 1.3. For now a workaround to use TLS 1.2 has been put in place, but we have to worry that client upgrades may start causing failures. We need to understand better how to avoid this. This concerns clients on centos7 and 8. Alessandra didn't notice any problem when running smoke tests because so far she has run them against DPM, and DPM doesn't use TLS 1.3.
- further points in the contribution minutes
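The TLS 1.2 workaround mentioned above was applied to the curl client; the minutes do not record the exact setting used. As an illustration only, the same kind of cap can be sketched with the Python standard library `ssl` module. The helper name `tls12_capped_context` is hypothetical.

```python
import ssl


def tls12_capped_context() -> ssl.SSLContext:
    """Build a client context that never negotiates above TLS 1.2."""
    # Default context keeps certificate and hostname verification enabled.
    ctx = ssl.create_default_context()
    # Cap the negotiated version so a TLS 1.3 handshake is never attempted.
    ctx.maximum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

Such a context can be passed to `urllib.request.urlopen(url, context=ctx)`. For curl itself, the `--tls-max 1.2` option has a similar effect. Pinning the version is a stop-gap: it hides the TLS 1.3 failure and does not explain it.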
Question from Julia: for the next upgrade at dcache sites, which dcache version should they upgrade to? The latest, to avoid bugs that have already been fixed. The choice between 5.2.x and 6.2.x is up to the site.
ATLAS has 20 sites configured as active destinations. They use HTTP with any site that has it enabled, so the matrix is quite large.
AOB
- The WLCG workshop in November will concentrate on storage. There is a call for sites to present their status and plans regarding storage. The aim is to produce a roadmap for storage leading up to HL-LHC. Volunteer sites should add themselves to the Google doc.
- Paul asked whether there is a deadline. [info from after the meeting] The plan is to collect talks within the next two weeks.