ACAT 2016

Name: ACAT 2016
Start: 2016-01-18T08:00:00-03:00
End: 2016-01-22T18:00:00-03:00
Location: UTFSM, Valparaíso (Chile)

18–22 Jan 2016

Chile/Continental timezone

Secretary

acat2016@usm.cl

Performance and Advanced Data Placement Techniques with Ceph’s Distributed Storage System

21 Jan 2016, 15:45

25m

Avenida España 1680, Valparaíso Chile

Oral Computing Technology for Physics Research Track 1

Speaker: Michael Poat (Brookhaven National Laboratory)

The STAR online computing environment is a demanding, concentrated multi-purpose compute system designed to obtain maximum throughput and process concurrency. Extending the STAR compute farm from a simple job-processing tool for data taking into a multi-purpose resource equipped with a large storage system would turn dedicated resources into an extremely efficient and attractive multi-purpose facility. To achieve this goal, our compute farm uses the Ceph distributed storage system, which has proven to be an agile solution thanks to its POSIX interface and the strong I/O concurrency of its object storage. We have taken our cluster one step further, squeezing out more performance by investigating and leveraging new technologies and key features of Ceph. By acquiring a 10Gb backbone network, we have eliminated the network as a limitation. With the further acquisition of large, fast drives (1TB SSDs), we will also show how one can customize the placement of data and make good use of the I/O performance tuning options Ceph has to offer. Finally, we will discuss OSD pool mapping in the context of redundancy based on compute racks, rows, PDUs and other physical parameters. We will also present and discuss a cost comparison of our cluster with traditional storage systems such as NAS and SAN, and the performance of older hardware working together as one cooperative storage system. We will present our latest performance results as well as the stability, lessons learned, and overall experience with the STAR Ceph cluster, and the steps taken to mitigate the problems we’ve come across.
Furthermore, we will present the tools we used to manage, maintain, and monitor the Ceph cluster, such as the CFEngine configuration management tool and the Icinga infrastructure monitoring system, which give the STAR admins a bird’s-eye view of the cluster state and a centrally managed point to ensure configuration consistency. We hope our presentation will serve the community’s interest in the Ceph distributed storage solution.

Authors: Jerome LAURET (Brookhaven National Laboratory), Michael Poat (Brookhaven National Laboratory)

ACAT_2016_slides_12.pdf

ACAT_2016_slides_12.pptx

ACAT2016_Advanced_Data_Placement_Ceph_9_NEW_TEMPLATE.pdf
