* Current RC version
* Discuss versioning schema
* Discuss release and deployment pace
* Future goals
- Short term
- Medium and long term
* "Help Wanted"
- "Ops" discussion
Demo of the first version of the Web UI configuration interface
16:00 → 16:45
Discussion on requirements, standing issues and priorities (45m)
Time allocated for open discussion, especially experiment issues, requests, future plans...
ATLAS topics (15m)
Transactional submission
Rucio state machine: 1) QUEUED → 2) SUBMITTING → 3) FTS submission → 4) SUBMITTED (update of the Rucio DB with the FTS job id), etc.
If something goes wrong between 3) and 4), such as an FTS timeout or Rucio DB issues, we don't know whether the job was successfully submitted to FTS. Recovery options:
An id that we can pass at submission time and later use to check whether the job is there or not
A way to check whether a job has been submitted for given destination URL(s)
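The first recovery option can be sketched as follows. This is only an illustration of the idea, not the FTS or Rucio API: `FakeFTS`, `SubmissionTimeout`, `submit_with_recovery`, the `client_id` field and the `job-N` ids are all hypothetical names made up for the example. The client generates an id before submitting; if the reply is lost between steps 3) and 4), it looks the job up by that id instead of guessing.

```python
import uuid


class SubmissionTimeout(Exception):
    """Raised when the (hypothetical) server does not answer in time."""


class FakeFTS:
    """In-memory stand-in for a transfer service. NOT the real FTS API."""

    def __init__(self, lose_request=False, lose_reply=False):
        self.jobs = {}                  # client_id -> job_id
        self.lose_request = lose_request
        self.lose_reply = lose_reply

    def submit(self, client_id, files):
        if self.lose_request:
            raise SubmissionTimeout("request never reached the server")
        # Idempotent: the same client_id always maps to the same job.
        if client_id not in self.jobs:
            self.jobs[client_id] = "job-%d" % (len(self.jobs) + 1)
        if self.lose_reply:
            raise SubmissionTimeout("job stored, but the reply was lost")
        return self.jobs[client_id]

    def lookup(self, client_id):
        """Return the job id registered for client_id, or None."""
        return self.jobs.get(client_id)


def submit_with_recovery(fts, request):
    """Move a request from QUEUED to SUBMITTED, surviving a lost reply."""
    request["state"] = "SUBMITTING"
    try:
        job_id = fts.submit(request["client_id"], request["files"])
    except SubmissionTimeout:
        # We do not know whether the job exists: ask by client id.
        job_id = fts.lookup(request["client_id"])
        if job_id is None:
            request["state"] = "QUEUED"   # safe to retry later
            return request
    request["job_id"] = job_id            # step 4: record the job id
    request["state"] = "SUBMITTED"
    return request


request = {"client_id": str(uuid.uuid4()), "files": ["file-a"], "state": "QUEUED"}
result = submit_with_recovery(FakeFTS(lose_reply=True), request)
```

Because the lookup is keyed on an id chosen by the client, a retry after a timeout cannot create a duplicate job in this sketch: the request either goes back to QUEUED or is attached to the job that already exists.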
Get error information from FTS, both from messages and by polling:
Recoverable error or not
source or destination error
Possibility to provide regexps to classify the error.
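A minimal sketch of what regexp-based classification could look like on the client side. The patterns and the sample error strings are illustrative assumptions, not actual FTS messages, and `Verdict`, `RULES` and `classify` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class Verdict(NamedTuple):
    side: str                      # "source", "destination" or "unknown"
    recoverable: Optional[bool]    # None when no rule matched


# Ordered rules: the first matching pattern wins.
# The patterns are examples only; real ones would come from the experiment.
RULES = [
    (re.compile(r"^SOURCE\b.*(timed? ?out|connection refused)", re.I),
     Verdict("source", True)),
    (re.compile(r"^SOURCE\b.*no such file", re.I),
     Verdict("source", False)),
    (re.compile(r"^DESTINATION\b.*(timed? ?out|connection refused)", re.I),
     Verdict("destination", True)),
    (re.compile(r"^DESTINATION\b.*(no space left|permission denied)", re.I),
     Verdict("destination", False)),
]


def classify(message):
    """Return the Verdict of the first rule matching the error message."""
    for pattern, verdict in RULES:
        if pattern.search(message):
            return verdict
    return Verdict("unknown", None)


verdict = classify("DESTINATION No space left on device")
```

Keeping the rules in an ordered list means more specific patterns can be placed ahead of generic ones, and unmatched messages fall through to an explicit "unknown" verdict instead of being silently treated as recoverable.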
Multiple sources in one bulk job
Today we submit one multi-source transfer per job
Issue with messages confirmed yesterday.
Wrong FAILED ↔ SUCCESS transition and race conditions
We no longer consume messages and only poll, which performs worse and generates more load.
Many errors when writing to tape (TRIUMF): why does the optimizer not back off?
Scalability/performance issue when getting all requests done in the last 2 hours?
1 hour to get more than 10k requests for the last 2 hours. All FTS servers.