Discussion of the minimum set of monitoring attributes for data transfer and data access

Europe/Zurich
Description

There is a draft of the minimum set of attributes we will discuss today:

https://docs.google.com/document/d/1tBECfGHk4AybPoorpEe2WiBwYH9zodv-4shiW1RGUv4/edit

 

Based on the feedback received, I've prepared a list of questions. Everyone with the link should be able to edit this list and add points to it:

https://docs.google.com/document/d/16-amJLp5oh3-sjReXh2djkeLdWaZmPdPYC_OCj8LQ9Y/edit

 

Videoconference
WLCG Operations Coordination
Zoom Meeting ID
61934915211
Description
WLCG Operations Coordination
Host
Maarten Litmaath
Alternative host
Julia Andreeva

Attended:

Borja, Julia, Alessandra, Ewoud (WLCG monitoring task force)

Costin, Maarten (ALICE)

David C. (ATLAS)

Federica (CMS)

Concezio, Christophe (LHCb)


 

List of questions discussed during the meeting 27.10.2022


 

The scope


 

The focus was on remote data transfers via the HTTP and xrootd protocols handled by FTS, and on remote access via xrootd. For these cases we consider the following sources of monitoring data: FTS, xrootd servers and dCache servers (dCache with the xrootd protocol). Are we missing anything?


 

It would be good to understand whether the same schema would be good enough if we want to use it for xCache monitoring, or for staging in and out to the WNs from the local storage.


 

---

We foresee monitoring all data traffic, both remote and local. Remote traffic can be distinguished from local traffic by comparing the source and destination sites. The set of attributes under discussion looks fine for both use cases. To distinguish staging in/out from data access, we will use the activity, which every experiment was asked to define.
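The distinction described above can be illustrated with a minimal Python sketch. The field names (`src_site`, `dst_site`, `activity`) and the activity labels are assumptions made for illustration only; the actual attribute names come from the draft schema, and each experiment defines its own activities.

```python
from dataclasses import dataclass

# Hypothetical activity labels; each experiment defines its own set.
STAGING_ACTIVITIES = {"stage-in", "stage-out"}


@dataclass
class TransferRecord:
    src_site: str   # resolved source site name (assumed field name)
    dst_site: str   # resolved destination site name (assumed field name)
    activity: str   # experiment-defined activity label


def traffic_scope(rec: TransferRecord) -> str:
    """Classify traffic as local or remote by comparing the two sites."""
    if not rec.src_site or not rec.dst_site:
        return "unknown"
    return "local" if rec.src_site == rec.dst_site else "remote"


def traffic_kind(rec: TransferRecord) -> str:
    """Separate staging in/out from data access using the activity."""
    return "staging" if rec.activity in STAGING_ACTIVITIES else "data access"
```

For example, a record with the same site on both ends and the activity `stage-in` would be classified as local staging traffic.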


 

Information sources


Do we need to include the information source as a reported meta attribute, or can it be a derived attribute defined from the data flow itself (different sources will use different data flows)?


 

---

It should be possible to see the data source in the UI. How this is implemented is a technical detail. There is no need to include it in the list of meta attributes reported by the source, since each of them (xrootd, FTS, dCache) will have a separate data flow.
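One possible way to derive the data source on the monitoring side, rather than having it reported, is to tag each record with the source implied by the flow it arrived on. The flow names and the `data_source` key in this sketch are hypothetical and are not taken from any existing MONIT configuration.

```python
# Hypothetical flow names mapped to the source they imply.
FLOW_TO_SOURCE = {
    "fts_flow": "FTS",
    "xrootd_flow": "xrootd",
    "dcache_flow": "dCache",
}


def tag_with_source(record: dict, flow_name: str) -> dict:
    """Return a copy of the record with a derived 'data_source' field."""
    tagged = dict(record)  # leave the reported record untouched
    tagged["data_source"] = FLOW_TO_SOURCE.get(flow_name, "unknown")
    return tagged
```

The reported record stays unchanged, so the sources do not need to know about this attribute at all.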



 

IP address or hostname


 

The hostname is not an attribute that is supposed to be exposed to the end user. Rather, it will be used to resolve the source and destination site names. In that case, do we need to care whether it is user-friendly, or can we avoid requiring an expensive translation of IP address to hostname?


---

Currently we have the hostname for FTS and the IP address for xrootd. We probably do not need to enforce one or the other, assuming that we can translate both into a site name. The IP address can probably be reported under the same attribute name as the hostname.
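The following Python sketch shows how a single attribute could carry either a hostname or an IP address and still be resolved to a site name. It is not how the xrootd collector works (that is the open question below). The lookup tables are invented for illustration, using the documentation address range 192.0.2.0/24 and the reserved domain example.org.

```python
import ipaddress
from typing import Optional

# Hypothetical lookup tables; a real deployment would fill these
# from a topology source.
SITE_BY_NETWORK = {ipaddress.ip_network("192.0.2.0/24"): "SITE_A"}
SITE_BY_DOMAIN = {"example.org": "SITE_B"}


def endpoint_kind(value: str) -> str:
    """Tell whether the reported value is an IP address or a hostname."""
    try:
        ipaddress.ip_address(value)
        return "ip"
    except ValueError:
        return "hostname"


def resolve_site(endpoint: str) -> Optional[str]:
    """Resolve a hostname or IP address to a site name, or None if unknown."""
    if endpoint_kind(endpoint) == "ip":
        addr = ipaddress.ip_address(endpoint)
        for network, site in SITE_BY_NETWORK.items():
            if addr in network:
                return site
        return None
    host = endpoint.lower().rstrip(".")
    for domain, site in SITE_BY_DOMAIN.items():
        if host == domain or host.endswith("." + domain):
            return site
    return None
```

Under these assumptions, no reverse DNS lookup is needed: an IP address is matched against known site networks, and a hostname against known site domains.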

 

Question to be clarified by Borja:

How does the current implementation of the xrootd collector translate the IP address into a site name?



 

Error category and error code


 

Would Costin's suggestion to have protocol + error code from the server be good enough?


 

Currently we consider FTS, xrootd and dCache servers


 

---

FTS reports both a numeric error code (gfal2) and an error category. It would be good to have the same from xrootd; this is to be discussed with the xrootd developers. We should avoid resolving the error category on the MONIT side, as this is not sustainable.
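As a rough illustration of the preference stated above, the monitoring side could treat the error information as an opaque record supplied by the source and only pass it through, never deriving the category itself. The field names and the placeholder label in this Python sketch are assumptions, not part of the agreed schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ErrorInfo:
    protocol: str             # protocol as reported by the source
    code: int                 # numeric error code from the server
    category: Optional[str]   # category as reported by the source, if any


def error_key(err: ErrorInfo) -> str:
    """Build a grouping key without resolving the category on our side."""
    # A missing category is left visibly missing; it is not guessed
    # from the error code.
    category = err.category if err.category else "UNCATEGORIZED"
    return f"{err.protocol}:{err.code}:{category}"
```

A source that does not yet report a category then shows up under the placeholder label, which makes the gap visible without introducing a mapping table that has to be maintained centrally.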


 

Wouldn't {multi_source; multi_hop; overwrite} flag information be useful?


 

There were no replies to this comment in the Google Doc. I am not sure whether these flags are useful in the context we are discussing.

 

---

Though it can be useful for debugging particular transfers, we might not need it for monitoring WLCG traffic.


 

Something else?


 

We agreed that the draft is ready to be discussed with the development teams (xRootD, dCache, FTS) for further implementation.
