WLCG Traceability and Isolation WG (Vidyo meeting)

31/S-028 (CERN)



Present: Brian Paul Bockelman, Dave Dykstra, David Crooks, Miguel Martinez Pedreira, Mischa Salle, Vincent Brillault

● Welcome and minutes from last meeting

  • No comments on the minutes of the previous meeting
  • Vincent reported that a new WG, the "container" working group, was created at the last GDB. It will take care of the uniform deployment of an isolation solution (e.g. Singularity). As a result, this working group will now focus only on traceability

● Compute traceability: artefacts & challenges

Two axes to work on (potentially in parallel):

  • Runtime traceability: finding more details (e.g. user identifier) about a running job from artefacts left by the pilot/VO framework (file names, log files, etc.)
    • Needed for a better challenge capable of validating the VO findings
    • Actions on VOs: identify what is already available and can be collected
  • Offer a possibility for sites to collect these logs (like FNAL for CMS):
    • HTCondorCE has a new feature that can produce audit logs if the jobs push back the right information.
      • Dave will share a command line that can be used to produce that data
      • This does not cover all CEs and might only be a stopgap solution
    • What should be required from VOs?
      • First estimate: start/stop action, unique user identifier (opaque string)
      • More debug information from VOs? (e.g. pilot ID, job ID for the pilot framework, etc.)
      • Is a specific format needed?
    • How to collect it from other CEs, for other VOs?
      • Suggestions welcome!
    • Discussions to be started by email
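
As a starting point for the format discussion, the first estimate above (start/stop action plus an opaque user identifier, with optional pilot and job IDs) could be sketched as one JSON record per event. This is only an illustration: the class name, the field names and the choice of JSON are assumptions made here, not something agreed at the meeting and not the HTCondorCE audit log format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class PayloadEvent:
    """Hypothetical record a pilot could push to a site (field names are assumptions)."""

    action: str                     # "start" or "stop"
    user_id: str                    # opaque string, only meaningful to the VO
    timestamp: int                  # Unix time in seconds (UTC)
    pilot_id: Optional[str] = None  # optional debug information
    job_id: Optional[str] = None    # optional debug information

    def __post_init__(self) -> None:
        # Reject records the site could not interpret later.
        if self.action not in ("start", "stop"):
            raise ValueError("action must be 'start' or 'stop'")
        if not self.user_id:
            raise ValueError("user_id must not be empty")


def to_log_line(event: PayloadEvent) -> str:
    """Serialise an event as a single JSON line, omitting unset optional fields."""
    record = {key: value for key, value in asdict(event).items() if value is not None}
    return json.dumps(record, sort_keys=True)
```

A pilot would emit one such line when a user payload starts and another when it stops. The site does not need to interpret `user_id`: it only has to store it so that the VO can map it back to a user during an investigation.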

● Storage traceability

  • Brian presented a pilot using tokens to authenticate and authorize for storage (XROOTD) actions
    • This is based on standard technology and existing work (e.g. auth0)
    • As discussed at previous meetings, this approach is compatible with the current ALICE system, so convergence could be possible in the long term
    • As nobody from ATLAS or LHCb was present, there was no feedback from them.
    • This will be presented at the next GDB, at which time it will be possible to collect interest from all participants
  • Brian suggested running a storage challenge
    • This would be technically more difficult than the simple compute challenge currently being discussed
    • What can be tested, and how, is to be discussed by email
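
To make the token idea more concrete for the email discussion, the sketch below shows the general principle of a signed token that carries a user identifier, an allowed action and a path, which a storage service can verify and log. It is not the format or the code of the pilot Brian presented: the function names, the claim names and the shared-secret HMAC signature are all assumptions chosen to keep the example self-contained.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def _sign(secret: bytes, payload: str) -> str:
    return _b64(hmac.new(secret, payload.encode("ascii"), hashlib.sha256).digest())


def issue_token(secret: bytes, subject: str, scope: str, path: str, expires_at: int) -> str:
    """Issue a token allowing `scope` (e.g. "read") under `path` until `expires_at`."""
    claims = {"sub": subject, "scope": scope, "path": path, "exp": expires_at}
    payload = _b64(json.dumps(claims, sort_keys=True).encode("utf-8"))
    return payload + "." + _sign(secret, payload)


def authorize(secret: bytes, token: str, action: str, path: str, now: int) -> Optional[str]:
    """Return the token subject if the request is allowed, otherwise None."""
    try:
        payload, signature = token.split(".")
    except ValueError:
        return None  # malformed token
    if not hmac.compare_digest(signature, _sign(secret, payload)):
        return None  # signature does not match: forged or altered token
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if now >= claims["exp"] or action != claims["scope"]:
        return None  # expired, or action not covered by the token
    prefix = claims["path"].rstrip("/")
    if path != prefix and not path.startswith(prefix + "/"):
        return None  # path outside the authorised directory tree
    return claims["sub"]  # can be written to the storage access log
```

Because `authorize` returns the subject, every accepted storage action can be logged with a user identifier, which is the traceability property of interest here. A storage challenge could then check whether sites are able to recover that identifier for a given file access.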

● Actions & next meeting

  • AOB:
    • Dave noted that the name of the WG is no longer in line with the mailing list and the website. Vincent will see what can be done.
  • Actions:
    • VOs: identify and report artefacts that can be used for identifying jobs/users
    • Dave: share commands that can be run to populate the HTCondorCE logs from a running job
    • Vincent: start an email discussion on:
      • How can pilots push user/job information to sites? What should be pushed?
      • Ideas on how to do a Storage Challenge
  • Given the lack of progress and activity, no meeting is scheduled yet. Once the email discussion has progressed enough, another meeting will be scheduled via Foodle
