FTS3 Steering

Europe/Zurich
31/S-023 (CERN)

Alejandro Alvarez Ayllon (CERN)
  • Release candidate is 3.5.x. Patch version will depend on the iterations while running on the pilot.

From Atlas document

  • Atlas asked how to find out which version is running on a server
    • REST publishes it at the root URL, e.g. https://fts3.cern.ch:8446/
      • "api": {"major": 3, "minor": 4, "patch": 2}
    • Devs should find a way of publishing the core version as well [FTS-703]
  • Atlas agreed to move part of the production load to the Pilot service
    • Rather than being a one-off, this load can remain indefinitely, providing invaluable feedback for devs
  • XrdCp transfers from EOS to Castor: used by anybody?
    • Not via FTS, although CMS uses XrdCp directly and is satisfied with the results.
    • The gfal2 xrootd plugin relies on the xrootd libraries, so it is mature enough
    • There are some concerns about the suitability of xrootd outside internal CERN transfers
  • Fair-share: FTS schedules per link first, and then by activity. Higher-priority transfers A->D can therefore be starved by lower-priority transfers on other links (X->D), which are scheduled first and exhaust the storage limit of D
    • Breaking the strict ordering of FTS when scheduling may be enough, and easier to implement [FTS-704]
  • There are a fair number of small files coming from Atlas. Session reuse will help when switching to GsiFTP only.
    • Can FTS decide when to use it?
    • As of today, session reuse is for the whole job, or nothing.
    • Low hanging fruit: jobs with several small files [FTS-705]
  • Very long term: cross check theoretical bandwidth with achieved throughput. Can this provide feedback for FTS?
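
The fair-share starvation discussed above can be illustrated with a toy model. This is only a sketch and not the FTS scheduler: the Transfer tuple, the two scheduling functions, the numeric priorities and the slot count are all hypothetical, and every queue is assumed to target the same destination D. It compares walking the links in a fixed order with picking across links by priority.

```python
from collections import namedtuple

# Hypothetical, simplified transfer record (not an FTS data structure).
Transfer = namedtuple("Transfer", ["source", "dest", "priority"])


def schedule_per_link(queues, dest_slots):
    """Strict ordering: walk the links in order and fill the destination
    slots as we go. Later links get nothing once the slots are used up."""
    scheduled = []
    for transfers in queues.values():
        for transfer in transfers:
            if len(scheduled) >= dest_slots:
                return scheduled
            scheduled.append(transfer)
    return scheduled


def schedule_by_priority(queues, dest_slots):
    """Relaxed ordering: pool the pending transfers of all links and take
    the highest priorities first (the sort is stable, so ties keep order)."""
    pending = [t for transfers in queues.values() for t in transfers]
    pending.sort(key=lambda t: t.priority, reverse=True)
    return pending[:dest_slots]


# Link X->D is visited first and carries low-priority transfers;
# link A->D carries high-priority transfers. D accepts 4 transfers.
queues = {
    ("X", "D"): [Transfer("X", "D", 1) for _ in range(5)],
    ("A", "D"): [Transfer("A", "D", 5) for _ in range(3)],
}

strict = schedule_per_link(queues, dest_slots=4)
relaxed = schedule_by_priority(queues, dest_slots=4)

print([t.source for t in strict])   # ['X', 'X', 'X', 'X']
print([t.source for t in relaxed])  # ['A', 'A', 'A', 'X']
```

With strict per-link ordering the A->D transfers never get a slot; once the ordering is relaxed they are served first and the remaining slot goes to X->D.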

Configuration

  • Needs to be harmonized, so VOs know what the others are doing, what the values are...
    • Need to involve all parties (VOs, devs...)
    • Consultancy from devs may be required for setting the values as well
    • Maybe better to iterate than to keep discussing
  • Agreed on creating a JIRA project to keep track of why changes are done
    • FTS provides an audit, but not tracking of rationales
    • JIRA for the moment; integration to be considered (i.e. automatic ticket creation)
    • [FTS-706]

Other

  • Stalled connections: small improvements at CERN, but not yet 100% solved
    • Sites to be notified of the required configuration changes once CERN disappears from the alerts
    • Disabling Gridsite passcode files and limiting max requests per client seem to help, but do not fully solve the problem
  • Deletion:
    • ATLAS, CMS and LHCb do not use, and do not plan to use deletions
    • To be removed [FTS-707]
  • SOAP
    • Only CMS pending migration, but going as planned
    • In ~2 months the REST implementation will go to production (that is, around the beginning of November)
    • Calendar is maintained: [FTS-600], [FTS-601]
      • Monitoring being put in place to trace users still using SOAP
      • SOAP can be shut down progressively before the rollout of 3.6 to detect outliers
  • Downtime may be required for 3.6 and database optimizations
    • No objections by anyone
    • Pilot can be used as pivot
    • No need to drain, but submissions must stop
      • Pollers may keep running, so either read-only access must remain available or 503 statuses must be returned
    • To discuss the dates. January doesn't seem to fit.
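
The two downtime options mentioned above for clients that keep polling (read-only access, or 503 responses) could be sketched as a small gate in front of the request handlers. This is an assumption-laden illustration, not FTS code: the function name, the mode strings and the retry interval are hypothetical. Only the 503 status and the standard Retry-After header come from HTTP itself.

```python
from http import HTTPStatus

# Methods that do not modify state, so pollers can keep using them.
READ_ONLY_METHODS = {"GET", "HEAD", "OPTIONS"}


def maintenance_response(method, mode="read-only", retry_after_s=600):
    """Decide what to do with a request during the downtime.

    Returns None if the request may proceed, otherwise a
    (status_code, headers) pair to send back immediately.
    mode is hypothetical: "read-only" lets polling through,
    "closed" rejects everything.
    """
    if mode == "read-only" and method.upper() in READ_ONLY_METHODS:
        return None
    headers = {"Retry-After": str(retry_after_s)}
    return int(HTTPStatus.SERVICE_UNAVAILABLE), headers


print(maintenance_response("GET"))                 # None
print(maintenance_response("POST"))                # (503, {'Retry-After': '600'})
print(maintenance_response("GET", mode="closed"))  # (503, {'Retry-After': '600'})
```

In read-only mode pollers keep getting job states while new submissions are refused. In closed mode every client gets a 503 and a hint about when to retry.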

 
