Notes ARC CE
- Can be used like any other CE: payloads are submitted with no data handling by the CE, and the VOs handle file transfers and credentials.
- Can also offer more features, used particularly by HPC sites, e.g. NDGF. Jobs have to play by the rules of the local site, i.e. no pilots, and worker nodes may be offline. ARC CE pre-fetches input data (normally cached), and also handles output data (authorisation required) and cache cleanup.
- A list of input file names is submitted with the job description; it states what needs to exist at the site for the job to run. Output files may be generated by the job and are not always known in advance.
- ARC CE workflow
- Downloads input files if they don’t already exist at the site (delegated authZ with VOMS roles)
- When all inputs are in place, the job goes to the batch system. It is executed offline and doesn’t normally need access to credentials
- When the job leaves the batch system, ARC CE takes over and uploads the output files to their destination. Credential renewal is often needed at this point
- Credentials may also be needed for accounting, cache cleanup, cancellation etc.
- Different kinds of jobs (production, monitoring etc.) share the same general workflow
- We need people from experiment frameworks to participate in this discussion
- AuthZ WG should come out with some recommendations
- Some are already out there, since e.g. HTCondor has implemented them
- A hackathon might be appropriate to make progress (last CE hackathon: https://indico.cern.ch/event/1032742/, see the GlideinWMS slides)
- We started defining capabilities for compute and storage, with some assumption that different tokens would be used
- It will be down to the VOs to decide how they fill the tokens
- Steep learning curve when coming to OAuth
- Need very clear recommendations with few choices
- The pilot case is largely solved for ARC. How?
- Pilot infrastructure takes care of token success, control etc. Fewer questions.
- In pilot jobs, the pilots do the file movements
- VO frameworks important, how can we get them involved?
- PanDA (need to find out what’s been going on)
- DIRAC (hopefully covered by Andrii)
- GlideinWMS (CMS, covered by Brian)
- HTCondor plans to shut down X.509 support by autumn next year
- Timeline https://docs.google.com/document/d/11fcZU8fEsfjDiSkjh95nVr4tNXLPCA_xwr2SwriBpiw/edit#heading=h.s0j7quda1urv
- Grid community plans to keep Globus alive during Run 3
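
The ARC CE workflow noted above (stage-in of missing inputs, batch execution without credentials, stage-out with possible credential renewal) can be sketched roughly as follows. This is a toy model for discussion, not ARC code: `Job`, `Stage`, `stage_in`, `run_in_batch`, `stage_out` and all callback parameters are hypothetical names, and the site cache is represented by a plain dictionary.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class Stage(Enum):
    STAGE_IN = "stage-in"
    BATCH = "batch"
    STAGE_OUT = "stage-out"
    DONE = "done"


@dataclass
class Job:
    inputs: list                      # file names listed in the job description
    outputs: list = field(default_factory=list)  # only known after the job ran
    stage: Stage = Stage.STAGE_IN


def stage_in(job: Job, cache: dict, fetch: Callable) -> list:
    """Download only the inputs missing from the cache; return their names."""
    fetched = []
    for name in job.inputs:
        if name not in cache:
            # Assumption: fetch() is where a delegated credential would be used.
            cache[name] = fetch(name)
            fetched.append(name)
    job.stage = Stage.BATCH
    return fetched


def run_in_batch(job: Job, cache: dict, payload: Callable) -> dict:
    """Run the payload on the staged inputs; no credential is passed in."""
    data = {name: cache[name] for name in job.inputs}
    results = payload(data)           # maps output file name -> content
    job.outputs = sorted(results)
    job.stage = Stage.STAGE_OUT
    return results


def stage_out(job: Job, results: dict, upload: Callable,
              credential_valid: Callable, renew: Callable) -> None:
    """Upload outputs, renewing the credential first if it is no longer valid."""
    if not credential_valid():
        renew()
    for name in job.outputs:
        upload(name, results[name])
    job.stage = Stage.DONE
```

The point of the sketch is that a credential is only touched in `stage_in` and `stage_out`, which is why a long wait in the batch queue can make renewal necessary before the upload.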
Questions:
- How can a token be acquired?
- For programmatic access, use standard OAuth flows to get tokens from the token issuer, e.g. the client credentials flow
- Useful for e.g. pilot submission where the token is not issued to an end user
- (Let’s drop pilot discussion since that’s clear, let’s focus on the user token flows)
- Also some tools for users to get user tokens
- Can we assume that the token can be issued and refreshed transparently?
- Do we need to know what’s in the tokens?
- Would expect to get a single token for a job
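
For the programmatic case mentioned above, a client credentials request (RFC 6749, section 4.4) could look roughly like the sketch below. It uses only the Python standard library. The endpoint URL, client ID, secret and scope in the usage note are placeholders, and `build_token_request` and `fetch_token` are hypothetical helper names; the user token flows that the discussion wants to focus on are not covered here.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_token_request(token_endpoint, client_id, client_secret, scope=None):
    """Build (but do not send) an OAuth 2.0 client credentials token request."""
    form = {"grant_type": "client_credentials"}
    if scope:
        form["scope"] = scope  # space-separated list of requested scopes
    body = urllib.parse.urlencode(form).encode("ascii")

    # Client authentication with HTTP Basic. Assumption: the ID and secret
    # contain no characters that would need extra encoding.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()

    return urllib.request.Request(
        token_endpoint,
        data=body,  # supplying a body makes this a POST
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


def fetch_token(request, timeout=30):
    """Send the request and return the access token from the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["access_token"]
```

A caller would do something like `fetch_token(build_token_request("https://issuer.example.org/token", "my-client", "my-secret", "compute.create"))`, where every argument is illustrative. Here the token is issued to the client itself, not to an end user, which matches the pilot submission case noted above.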
Actions:
- WLCG AuthZ WG to come up with recommendations and guidelines
- Currently looking at this for FTS and Rucio (experiment developers needed)
- Will try and extract more generic recommendations afterwards