Visualization of dCache accounting information with state-of-the-art Data Analysis Tools.

Apr 14, 2015, 5:15 PM
B250 (B250)



Oral presentation
Track 4: Middleware, software development and tools, experiment frameworks, tools for distributed computing
Track 4 Session


Mr Tigran Mkrtchyan (DESY)


In recent years, storage providers in scientific infrastructures have faced a significant change in the usage profile of their resources. In the past, a small number of experiment frameworks accessed those resources in a coherent manner; now, a large number of small groups, and even individuals, request access in a completely chaotic way. Moreover, scientific laboratories have recently been forced to provide detailed accounting information for their communities and individuals. Another consequence of the chaotic access profiles is that operating teams, which are often rather small, find it difficult to detect malfunctions in extremely complex storage systems composed of a large variety of hardware components. Although information about usage and possible malfunctions is available in the corresponding log and billing files, the sheer amount of collected metadata makes it extremely difficult to handle or interpret. The dCache production instances at DESY alone produce gigabytes of metadata per day. To cope with these pressing issues, DESY has evaluated and put into production a Big Data processing tool that enables our operations team to analyze log and billing information through a configurable and easy-to-interpret visualization of that data. This presentation will demonstrate how DESY built a real-time monitoring system that visualizes dCache billing files and provides an intuitive, simple-to-operate Web interface, using ElasticSearch, Logstash and Kibana.
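
As a rough illustration of the kind of processing such a pipeline performs, the sketch below turns text records into JSON documents laid out for the Elasticsearch bulk API (one action line followed by one document line, newline-delimited). It is not the DESY setup, where that role is played by Logstash. The record layout, the field names and the index name `billing-example` are assumptions made for this example and do not reflect the real dCache billing format.

```python
import json
import re

# Hypothetical, simplified record layout (NOT the real dCache billing format):
#   <date> <time> <cell> <action> <bytes> <millis>
LINE_RE = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<cell>\S+) (?P<action>\S+) (?P<bytes>\d+) (?P<millis>\d+)$"
)


def parse_line(line):
    """Return a dict for a well-formed record, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "@timestamp": f"{m['date']}T{m['time']}",
        "cell": m["cell"],
        "action": m["action"],
        "bytes": int(m["bytes"]),
        "millis": int(m["millis"]),
    }


def to_bulk(lines, index="billing-example"):
    """Build a newline-delimited bulk request body, skipping unparsable lines."""
    out = []
    for line in lines:
        doc = parse_line(line)
        if doc is None:
            continue
        out.append(json.dumps({"index": {"_index": index}}))  # action line
        out.append(json.dumps(doc))                            # document line
    # A bulk body must end with a newline; an empty input yields an empty body.
    return "\n".join(out) + "\n" if out else ""


if __name__ == "__main__":
    sample = ["2015-04-14 17:15:00 pool1 transfer 1048576 250", "not a record"]
    print(to_bulk(sample), end="")
```

Converting the numeric fields to integers before indexing is what would let a dashboard aggregate over them (for example, summing bytes per cell) instead of treating them as text.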

Primary author

