Vincent Garonne (CERN)
The DDM Tracer Service traces and monitors ATLAS file operations on the Worldwide LHC Computing Grid. The volume of traces has increased significantly since the service started in 2009. There are now about 5 million trace messages every day, with peaks above 250 Hz that continue to climb, which poses a major challenge to the current service structure. Analysis of large datasets based on on-demand queries to the relational database management system (RDBMS), i.e. Oracle, can be problematic and can have a significant effect on the database's performance. Consequently, we have investigated newer high-availability technologies such as a messaging infrastructure, specifically ActiveMQ, and key-value stores. Key-value stores are distributed and highly scalable, and their write performance is usually much better than that of an RDBMS, all of which is very useful for the Tracer service. Indexes and distributed counters have also been tested to improve query performance and provide almost real-time results. In this talk, the design principles, architecture and main characteristics of the Tracer monitoring framework will be described, and examples of its usage will be presented.
Collaboration ATLAS (ATLAS)
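
The abstract contrasts on-demand RDBMS queries with counters that are kept up to date as traces arrive. A minimal, purely illustrative sketch of that idea follows. It uses a single in-process counter rather than a distributed store, and every name in it (`TraceCounters`, `BUCKET_SECONDS`, and the trace fields `timestamp`, `site`, `event_type`) is an assumption made for the example, not part of the Tracer service itself.

```python
from collections import Counter

# Assumed bucket width in seconds; the abstract does not specify one.
BUCKET_SECONDS = 60


class TraceCounters:
    """Pre-aggregated counts, updated once per incoming trace.

    Hypothetical stand-in for distributed counters in a key-value store:
    the work is done at write time, so a query becomes a single key lookup
    instead of a scan over raw trace rows.
    """

    def __init__(self):
        self._counts = Counter()

    def record(self, trace):
        # Round the timestamp down to the start of its time bucket.
        bucket = int(trace["timestamp"]) // BUCKET_SECONDS * BUCKET_SECONDS
        key = (bucket, trace["site"], trace["event_type"])
        self._counts[key] += 1

    def count(self, bucket, site, event_type):
        # Missing keys read as zero and are not created by the lookup.
        return self._counts[(bucket, site, event_type)]
```

With counters like these, a question such as "how many operations of a given type did a site see in a given minute" is answered by one lookup, which is the property that makes near real-time results plausible at the quoted message rates.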