FTS3 - Updates, status and roadmap

Europe/Zurich
28/R-015 (CERN)

Alejandro Alvarez Ayllon (CERN)
  • Reminder on SOAP deprecation plans
    • LHCb and CMS still relying on it
    • No complaints about the proposed steps and timing (see slides)
  • Resolved to establish periodic FTS3 Steering Meetings
    • One slot per month, only if needed
    • Open agenda, to be filled as necessary
    • Thematic, so parties can know in advance if they are interested
    • Need to decide the slot, so it doesn't clash
    • Next main topic, probably deployment and configuration
  • Went through the ATLAS standing issues (PDF attached to the agenda)
    • 'Phantom jobs' (job-ids for jobs that failed to commit to the database) are better than 'invisible jobs' (jobs that went to the database, but the client failed to get the job-id)
      • FTS3 Dev Team to find out whether the server can detect that the client has closed the connection
      • If so, roll back the job when the job-id could not be returned
      • In any case, there is a prototype implementation in 3.4.0 that allows retrieving non-terminal jobs for a given destination SURL
    • Duplicated messages for transfers
      • Two known causes remain:
        • Duplicated transfers (candidate fix in the Pilot@CERN, FTS-336)
        • False stalled transfers (still to decide the approach to fix it)
    • Monitoring view grouped by src-dst-activity
    • FTS set priority
    • Recoverable error field to be exposed
  • Two items brought to attention by LHCb
    • "Error 500" when connecting to the REST interface, although the situation is now better (GGUS #117206)
    • Reminder that the FTS3 service team should announce downtimes in the GOCDB
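
The rollback idea from the 'phantom jobs' item could be sketched roughly as follows. This is only an illustration under stated assumptions, not the FTS3 implementation: sqlite3 stands in for the real database, and `submit_job`, `send_reply` and the `jobs` table are hypothetical names. The job-id is sent before the commit, so a lost connection leads to a rollback, and the worst remaining case is a phantom job (id delivered, commit failed) rather than an invisible one.

```python
import sqlite3
import uuid


def submit_job(conn, send_reply):
    """Insert a job and commit only if the job-id reached the client.

    `send_reply` is a hypothetical callable that delivers the job-id and
    raises ConnectionError when the client has closed the connection.
    """
    job_id = str(uuid.uuid4())
    try:
        # sqlite3 opens a transaction implicitly before the INSERT.
        conn.execute(
            "INSERT INTO jobs (job_id, state) VALUES (?, ?)",
            (job_id, "SUBMITTED"),
        )
        send_reply(job_id)  # may raise ConnectionError
        conn.commit()
    except ConnectionError:
        # The client never received the id: discard the job.
        conn.rollback()
        return None
    return job_id


def make_db():
    """Create an in-memory database with a minimal, made-up jobs table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE jobs (job_id TEXT PRIMARY KEY, state TEXT)")
    conn.commit()
    return conn
```

Whether the real service can detect a closed client connection at this point is exactly the open question assigned to the FTS3 Dev Team above.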