CS3 Workshop on Cloud Services for File Synchronisation and Sharing

Name: CS3 Workshop on Cloud Services for File Synchronisation and Sharing
Start: 2017-01-30T06:15:00+01:00
End: 2017-02-01T19:25:00+01:00
Location: SURFSara

30 January 2017 to 1 February 2017

Europe/Zurich timezone

Support: cs3-surfnet2017@cern.ch

Automated error handling at Dropbox

31 Jan 2017, 11:40

40m

Amsterdam (SURFSara), Science Park

Technology

Karoly Nagy (Dropbox Inc.) Maxim Bublis (Dropbox Inc.)

At Dropbox, with thousands of MySQL servers, failures such as hardware errors are normal, not exceptional. Not a day passes without replacing at least one server because of some kind of hardware error. Our on-call engineers are not alerted for these failures; they are alerted only if the automation is not working properly.
This kind of automation is harder with stateful systems, so we wrote a general framework for it called Wheelhouse. In this framework, state machines describe the good states of a system and the transition steps between them.
In this talk we will show the following:

What happens to a slave in case of a hardware error
What happens to a master in case of a hardware error
What happens when we want to upgrade kernels
How we use this framework to coordinate schema changes between shards
How we use this framework to verify data consistency
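
The idea of a state machine that describes good states and the transition steps between them can be illustrated with a minimal sketch. This is not Wheelhouse's API, which the abstract does not show; the state names, step names, `TRANSITIONS` table, and `drive_to_good_state` function are all hypothetical and chosen only for illustration.

```python
from enum import Enum


class State(Enum):
    # Hypothetical states for a single database server.
    HEALTHY = "healthy"
    FAILED = "failed"
    REPLACING = "replacing"


# Hypothetical transition table: current state -> (next state, step name).
# HEALTHY has no entry because it is the good state we want to reach.
TRANSITIONS = {
    State.FAILED: (State.REPLACING, "allocate_spare"),
    State.REPLACING: (State.HEALTHY, "restore_and_rejoin"),
}


def drive_to_good_state(state, run_step):
    """Run transition steps until the server reaches a good state.

    run_step is a callable that takes a step name and returns True on
    success. Returns the list of states visited, starting state included.
    """
    history = [state]
    while state in TRANSITIONS:
        next_state, step = TRANSITIONS[state]
        if not run_step(step):
            # A step that cannot complete means the automation itself is
            # broken; in this sketch that is where a human would be alerted.
            raise RuntimeError(f"automation stuck at step {step!r}")
        state = next_state
        history.append(state)
    return history
```

Calling `drive_to_good_state(State.FAILED, run_step)` with a `run_step` that always succeeds walks the server through the replacement state and back to healthy. An exception is raised only when a step fails, which mirrors the policy of alerting on-call engineers when the automation is not working rather than on each hardware error.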

cs3_automation_dropbox.pdf
