9–13 Jul 2018
Sofia, Bulgaria
Europe/Sofia timezone

Ceph File System for the CERN HPC Infrastructure

10 Jul 2018, 15:45
15m
Hall 8 (National Palace of Culture)

Presentation, Track 4 - Data Handling (T4)

Speaker

Herve Rousseau (CERN)

Description

The Ceph File System (CephFS) is a software-defined network filesystem built upon the RADOS object store. In the Jewel and Luminous releases, CephFS was labeled as production ready with horizontally scalable metadata performance. This paper evaluates that claim against both the HPC and general IT infrastructure needs at CERN. We highlight the key metrics required by four users, including POSIX compliance, single-user small-file latency, multi-user metadata throughput, and metadata horizontal scalability and failure tolerance. We report on our experience so far and on the future evolution of the service.

In addition, we describe a new suite of micro-benchmarks which measures the small latencies of buffered and synchronous filesystem operations and can be used to quantify the evolving quality of a storage cluster over time. We also introduce a simple ping-like latency tool, fsping, which evaluates the time needed for two clients to notice file modifications in a shared filesystem. These tests, combined with several classical HPC benchmarks run in single- and multi-user scenarios, paint a picture of CephFS as an emerging, viable option for HPC storage and NFS-appliance replacements.
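
To make the two kinds of measurement concrete, the following is a minimal Python sketch, not the authors' benchmark suite and not the fsping tool itself. The function names (write_latency, notice_latency), the 4 KiB payload, the polling interval and the use of a local temporary directory are all illustrative assumptions. The first function times small-file writes with and without fsync; the second times how long a polling watcher needs to see a change made by a writer. Here both roles run as threads in one process, whereas a real test would place them on two separate clients of the shared filesystem.

```python
import os
import tempfile
import threading
import time


def write_latency(directory, sync, size=4096, repeats=20):
    """Median latency (seconds) of creating and writing one small file.

    Hypothetical helper: with sync=True each write is followed by fsync,
    otherwise the write is left in the page cache (buffered).
    """
    payload = b"x" * size
    samples = []
    for i in range(repeats):
        path = os.path.join(directory, f"bench_{i}.dat")
        start = time.perf_counter()
        fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o644)
        try:
            os.write(fd, payload)
            if sync:
                os.fsync(fd)  # wait until the data has been flushed
        finally:
            os.close(fd)
        samples.append(time.perf_counter() - start)
        os.remove(path)
    samples.sort()
    return samples[len(samples) // 2]


def notice_latency(path, poll_interval=0.001, timeout=5.0):
    """Seconds until a polling watcher sees a change made by a writer.

    Hypothetical single-host stand-in for a two-client test. Returns None
    if the change is not seen before the timeout.
    """
    with open(path, "w") as f:
        f.write("0")
    seen = {}
    ready = threading.Event()

    def watcher():
        ready.set()
        deadline = time.perf_counter() + timeout
        while time.perf_counter() < deadline:
            with open(path) as f:
                if f.read() == "1":
                    seen["t"] = time.perf_counter()
                    return
            time.sleep(poll_interval)

    thread = threading.Thread(target=watcher)
    thread.start()
    ready.wait()
    sent = time.perf_counter()
    with open(path, "w") as f:
        f.write("1")  # the modification the watcher is waiting for
    thread.join()
    return seen["t"] - sent if "t" in seen else None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print("buffered write:", write_latency(d, sync=False))
        print("synchronous write:", write_latency(d, sync=True))
        print("notice latency:", notice_latency(os.path.join(d, "ping")))
```

Pointing the directory at a mounted shared filesystem instead of a temporary directory would exercise that filesystem rather than local disk. With the writer and watcher on different hosts, a one-way timing like this would also need synchronized clocks; timing a full round trip on one host is a common way to avoid that.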

Primary authors

Dan van der Ster (CERN)
Massimo Lamanna (CERN)
Theofilos Mouratidis (National and Kapodistrian University of Athens (GR))
Herve Rousseau (CERN)
Dr Arne Wiebalck (CERN)
Pablo Llopis Sanmillan (CERN)
