Token Trust & Traceability WG

Europe/Zurich
513/R-068 (CERN)

Description

Fortnightly for the risk assessment season.

Zoom Meeting ID
64974356171
Host
Matthew Steven Doidge

https://codimd.web.cern.ch/Odu9oM1ZRUGYuThd1V9SIA?view

 

# TTT 16th September 2025

Attending: Matt, Luna, Maarten, Mischa, Linda

Apologies: DavidC, DakeK, DaveD, Marcus

## AOB
(as we might need to wrap up quickly, get this done first)


### Next meeting(s)
* Usual time of 15.00 CEST on September 23rd? 
Yes

Added 513/R-068 room to the meetings in indico.

## Last meeting
Went over WF-3: grid jobs with long-lived tokens.  
Lifetime scale of a few days. Could be stolen, and the credentials used.  
But the attacker would have to go through hoops to be able to perform abuse.  
Scope of damage?  
Attacker will be unable to delete.

But note that for WF-4, CMS currently have to give the storage.modify scope.

CMS are looking at replacing GlideinWMS with something else, so no changes are likely until then.  
ATLAS are further along with this.

Need to keep this on the to do list/radar

WF-3 describes what ALICE does and more or less what ATLAS intend to do.

How much is specific to a workflow

Luna suggests that we go to another WF and see if we have any overlap. 

ML - not WF-4, as it is too similar.

WF-1 (Long lived FTS token)  
TR-1 is very generic.  
As long-lived tokens are less susceptible to this.

But services are trusted more than others (no actual login to endpoints).

Downtimes are double-edged. We are most concerned with abuses. A service being down is a problem for users, but not a risk for us.

ML - Some difference between FTS and grid jobs workflows. Some experiments might use IAM for low rate work, some other issuer for high rate needs. This should be at least mentioned in our risk analysis. New Issuer comes with new risks.

But to first order, the requirement is to deploy the issuer in a sensible way.

Luna - TR-2 also the same.  
ML - agreed, fallout could be worse for some workflows than others. Sites are more concerned about jobs being run than data being deleted.

Matt gets the wrong end of the stick, but is corrected.

Luna - but note other issuers increase the attack surface.

ML - need to enhance the sheets.

Luna - TR-[1-3] seem to be generic.

TR-4 - this one would be N rows, one for each workflow. Each has its own condition/mitigation.

ML - ultimately will combine into one single sheet, but will likely work with smaller sheets.

There will be replicated lines - N times.   
e.g. TR-4 a, b, c, d, e, f  
Then come up with potential likelihood/impact, then identify workflows that need special attention.

TR-5
Luna - do we have long lived tokens for submission?  
ML - workload managers have long lived tokens within them, and get long lived tokens to do stuff with data.

Couples with TR-4, identifying weakness and wondering how it can be exploited.

TR-5 isn't independent: 4 & 5 are linked if an attacker is involved.

Luna - it is distinct if it's a user committing abuse.
Very much workflow-dependent.

ML - hope it doesn't need to be duplicated N times as well.

Some thoughts on VOBoxes. They are an additional risk: a service at a site, which privileged people can log in to and which could have sensitive network permissions.  
They probably deserve an explicit entry. Other VOBox-like services are spreading out.

ML - TR-5 will need to be split

TR-6 seems to be universal, so doesn't need to be split

TR-7 is the same. Data specific, but again universal.

So only need to split out TR-4 and 5, tripling the rows.
Add a,b,c,d,e,f

Change workflows from numbers to letters.

ML - contemplating renumbering the risks, to have 4 and 5 at the bottom. And perhaps rethink the natural order.

TR-1 - okay  
2 & 3 okay too  
Move 4 & 5  
Maybe swap 7 and 6?  

If we're lucky we might be able to optimise the split of 4 & 5, but that's a maybe, as rogue users and token theft are different.

Luna - we can merge them later if it looks like we can.

ML - see if we end up doing a lot of copy pasting, a sign of unneeded duplication.

Action - Matt will build the new spreadsheet, or at least a scratch version thereof.

ML - next week might be able to start the process. For some rows we might not be able to say much due to some issues, but hopefully identify some areas that need attention. Minimum outcome of this exercise - identify areas that people might assume are under control.

 

 

## Risk Scoring Review
* Look again at the "generic" scoring method
  * Decision from last time: not to consider reputational damage as a factor

Look at the scoring tables next week whilst we attempt the exercise, and see which table works best.

 

## Actions
Matt to rearrange the spreadsheet as discussed, ready for us to run the exercise next week. Share with the list ahead of time.

* Add subsections to what are now TR-4 and TR-5, labelled a-f
* Move TR-4 and TR-5 to the bottom
* Paste workflow-specific information into the subsections
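
As a rough illustration of the rearrangement described above, the row order could be generated along these lines. This is only a sketch: the bare `TR-n` labels, the `TR-4a` style of sub-row naming and the `restructure` helper are assumptions for the example, and the real sheet's columns and contents are not modelled.

```python
from string import ascii_lowercase

# Placeholder labels only; the real sheet carries full risk descriptions.
RISKS = ["TR-1", "TR-2", "TR-3", "TR-4", "TR-5", "TR-6", "TR-7"]

# Risks judged workflow-specific in the discussion (to be split per workflow).
WORKFLOW_SPECIFIC = {"TR-4", "TR-5"}

# Workflows are relabelled from numbers to letters; a-f gives six of them.
N_WORKFLOWS = 6


def restructure(risks, workflow_specific, n_workflows):
    """Return row labels with generic risks first and split risks at the bottom."""
    letters = ascii_lowercase[:n_workflows]
    # Generic risks keep a single row each, in their original order.
    generic = [r for r in risks if r not in workflow_specific]
    # Workflow-specific risks get one sub-row per workflow letter.
    split = [
        f"{r}{letter}"
        for r in risks
        if r in workflow_specific
        for letter in letters
    ]
    return generic + split


rows = restructure(RISKS, WORKFLOW_SPECIFIC, N_WORKFLOWS)
for row in rows:
    print(row)
```

If the split of TR-4 and TR-5 turns out to involve a lot of copy-pasting, sub-rows can be merged again by dropping a risk from `WORKFLOW_SPECIFIC`.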
 

    • 15:00 15:05
      Actions, Since Last Meeting, Next Meeting 5m
    • 15:05 15:20
      Discussion: Risk Analysis Scoring 15m

      Inspiration may be taken from these assessments from EGEE and WLCG done many years ago:

      Work through the Workflows added by Maarten to the document, and review the scoring methodology.

      Continue discussion from the list.

    • 15:20 15:45
      Discussion: Workflows 25m

      Probably just continuing the above.

      https://github.com/TTT-WG/TTT-WG/issues

    • 15:55 16:00
      AOB 5m

      Next meeting 23rd September (?)