- Alice report (only VO to report during the meeting):
- First document is high level and was already discussed during the meeting: Alice already follows these recommendations
- Alice keeps all information mentioned in the second document. However, the format used is different and Alice does not use syslog but an internal messaging solution
- Discussion on log format uniformity:
- Maarten: If we were to start from scratch, it would be doable. But changing the log format now would be horribly difficult and would cost a lot of effort
- Ian: We should still try to avoid duplication of effort and avoid using different tools and formats
- CMS can't directly modify its logging format: CMS relies on GlideinWMS, which may be open to suggestions, but not to a complete rewrite
- Maarten proposed that the VOs could send their data to a central cluster (e.g. maintained by the Security Team) which could process and normalize it.
- Vincent replied that the CERN Security Team currently can't cope with such data/load
- Agreement that we might find parts of each VO's logging infrastructure that could be enhanced, but that we should not try to build a new logging system
- Self-assessment discussion:
- Vincent asked if it would make sense for VOs to assess themselves by:
- Picking a random pilot job from the previous week
- Picking a random job that ran during the lifetime of that pilot job
- Checking how expensive it would be to identify the owner of that job and their activity/payload
- Maarten commented that similar exercises were done in the past, in particular a security challenge run against panda/ATLAS. Such a real challenge could be very expensive.
- Ian explained how, within the UK NGI, he was testing the argus deployment by running a job that would query a non-existent file on a web server: the web server logs easily show where the job ran
- Proposed self-assessment for VOs:
- Given a restricted time period and the IP address of a worker node, identify the cost and potential bottlenecks of finding the matching jobs and the corresponding users and payloads
- The assessment should be carried out under the assumption that this WG will succeed in building a complete isolation layer between the pilot and the jobs, and should thus ignore pilot-job take-over or hidden daemonized process scenarios
- Action: perform the proposed self-assessment
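
As an illustration only, the lookup behind the proposed self-assessment could be sketched as below. The `JobRecord` layout, the `find_matching_jobs` helper and the sample data are all hypothetical and were not discussed in the meeting: each VO stores this information in its own format, and the real cost lies in gathering such records from the VO's logging infrastructure, which this sketch does not model.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class JobRecord:
    # Hypothetical record layout; each VO's real logs will differ.
    job_id: str
    pilot_id: str
    user: str
    payload: str
    worker_ip: str
    start: datetime
    end: datetime


def find_matching_jobs(records, worker_ip, window_start, window_end):
    """Return the jobs that ran on worker_ip at any point in the time window."""
    return [
        r for r in records
        if r.worker_ip == worker_ip
        # Two intervals overlap when each one starts before the other ends.
        and r.start <= window_end
        and r.end >= window_start
    ]


# Made-up sample data (IP addresses from the documentation range 192.0.2.0/24).
records = [
    JobRecord("job-1", "pilot-A", "user1", "reco.sh", "192.0.2.10",
              datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 11, 0)),
    JobRecord("job-2", "pilot-A", "user2", "sim.sh", "192.0.2.10",
              datetime(2024, 1, 1, 12, 0), datetime(2024, 1, 1, 13, 0)),
    JobRecord("job-3", "pilot-B", "user3", "ana.sh", "192.0.2.11",
              datetime(2024, 1, 1, 10, 30), datetime(2024, 1, 1, 10, 45)),
]

matches = find_matching_jobs(
    records,
    worker_ip="192.0.2.10",
    window_start=datetime(2024, 1, 1, 10, 30),
    window_end=datetime(2024, 1, 1, 11, 30),
)
for r in matches:
    print(r.job_id, r.pilot_id, r.user, r.payload)
```

Timing how long it takes to assemble the equivalent of `records` for a given worker node and time period, and noting which step is slowest, would give the cost and bottleneck figures the assessment asks for.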