Oliver Keeble (CERN)
The overall success of LHC data processing depends heavily on stable, reliable and fast data distribution. The Worldwide LHC Computing Grid (WLCG) relies on the File Transfer Service (FTS) as the data movement middleware for moving sets of files from one site to another. This paper describes the components of FTS3 monitoring infrastructure and how they are built to satisfy the common and particular requirements of the LHC experiments. We show how the system provides a complete and detailed cross-virtual organization (VO) picture of transfers for sites, operators and VOs. This information has proven critical due to the shared nature of the infrastructure, allowing a complete view of all transfers on shared network links between various workflows and VOs using the same FTS transfer manager. We also report on the performance of the FTS service itself, using data generated by the aforementioned monitoring infrastructure both during the commissioning and the first phase of production. We also explain how this monitoring information and network metrics produced can be used both as a starting point for troubleshooting data transfer issues, but also as a mechanism to collect information such as transfer efficiency between sites, achieved throughput and its evolution over time, most common errors, etc, and take decision upon them to further optimize transfer workflows. The service setup is subject to sites policies to control the network resource usage, as well as all the VOs making use of the Grid resources at the site to satisfy their requirements. FTS3 is the new version of FTS and has been deployed in production in August 2014.