Setup of a resilient FTS3 service at GridKa

Not scheduled
15m
OIST

1919-1 Tancha, Onna-son, Kunigami-gun Okinawa, Japan 904-0495
poster presentation Track4: Middleware, software development and tools, experiment frameworks, tools for distributed computing

Speaker

Thomas Hartmann (KIT - Karlsruhe Institute of Technology (DE))

Description

The FTS service provides a transfer job scheduler to distribute and replicate vast amounts of data over the heterogeneous WLCG infrastructures. The most recent version, FTS3, is simpler and more flexible than the channel model of its predecessors while reducing the load on the service components. These improvements allow a single FTS3 setup to handle a higher number of transfers. Because FTS3 now covers continent-wide transfers, whereas the previous version handled only transfers related to specific clouds, the number of dependent users has grown and a resilient system becomes even more necessary. Having set up an FTS3 service at the German T1 site *GridKa* at *KIT* in Karlsruhe, we present our experiences in preparing a high-availability FTS3 service. To avoid single points of failure, we rely on a database cluster as a fault-tolerant data back-end and deploy the FTS3 service on its own cluster to provide a resilient infrastructure for the users. With the database cluster providing basic resilience for the data back-end, we ensure consistent and reliable database access at the FTS3 service level through a proxy solution. On each FTS3 node, an HAProxy instance monitors the health of each database node and distributes database queries over the whole cluster for load balancing during normal operations; if a database node fails, the proxy excludes it transparently to the local FTS3 service. The FTS3 service itself consists of a main and a backup instance. In case of an error, the backup instance takes over the identity of the main instance, i.e., its IP address, using a CTDB infrastructure, so that clients see a consistent service.
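The proxy behaviour described in the abstract (spread queries over all database nodes, skip any node that fails its health check) can be illustrated with a minimal Python sketch. This is not HAProxy's implementation and not the GridKa configuration: the class `HealthAwareBalancer`, the node names `db1` to `db3`, and the `is_healthy` callback are hypothetical, and round-robin is assumed as the balancing policy only for illustration.

```python
from typing import Callable, Iterable, List


class HealthAwareBalancer:
    """Round-robin selection over database nodes, skipping unhealthy ones.

    Illustrative only: a real proxy runs periodic health checks itself;
    here the check is a caller-supplied callback (an assumption).
    """

    def __init__(self, nodes: Iterable[str], is_healthy: Callable[[str], bool]):
        self._nodes: List[str] = list(nodes)
        if not self._nodes:
            raise ValueError("at least one database node is required")
        self._is_healthy = is_healthy
        self._next = 0  # index of the node to try first on the next pick

    def pick(self) -> str:
        """Return the next healthy node, or raise if none is available."""
        for _ in range(len(self._nodes)):
            node = self._nodes[self._next]
            self._next = (self._next + 1) % len(self._nodes)
            if self._is_healthy(node):
                return node
        raise RuntimeError("no healthy database node available")


def demo() -> List[str]:
    """Pick four nodes in normal operation, then three with db2 broken."""
    broken = set()  # hypothetical record of failed nodes
    balancer = HealthAwareBalancer(
        ["db1", "db2", "db3"], lambda node: node not in broken
    )
    picks = [balancer.pick() for _ in range(4)]  # all nodes healthy
    broken.add("db2")                            # simulate a node failure
    picks += [balancer.pick() for _ in range(3)]  # db2 is now skipped
    return picks


if __name__ == "__main__":
    print(demo())
```

In this sketch the caller never learns that `db2` failed; it simply stops being returned, which mirrors the "transparent to the local FTS3 service" property stated above.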

Primary author

Thomas Hartmann (KIT - Karlsruhe Institute of Technology (DE))

Co-authors

Andreas Petzold (KIT - Karlsruhe Institute of Technology (DE)); Kamil Wisniewski (Karlsruhe Institut for Technology); Ms Ludmilla Obholz (KIT Karlsruhe); Preslav Borislavov Konstantinov (Bulgarian Academy of Sciences (BG))
