Jul 9 – 13, 2018
Sofia, Bulgaria
Europe/Sofia timezone

ATLAS Analytics and Machine Learning Platforms

Jul 9, 2018, 3:00 PM
Hall 9 (National Palace of Culture)

Presentation, Track 6 – Machine learning and physics analysis


James Catmore (University of Oslo (NO))


In 2015 ATLAS Distributed Computing started to migrate its monitoring systems away from Oracle DB and decided to adopt new big data platforms that are open source, horizontally scalable, and offer the flexibility of NoSQL systems. Three years later, the full software stack is in place and the system is considered to be in production, operating at near maximum capacity in terms of both storage and tightly coupled analysis capability. The new model provides several tools for monitoring and accounting that are fast and easy to deploy. Its main advantages are ample ways to do complex analytics studies (using technologies such as Java, Pig, Spark, Python, and Jupyter), flexibility in reorganizing data flows, and near-real-time and inline processing. The analytics studies improve our understanding of the different computing systems and their interplay, thus enabling whole-system debugging and optimization. In addition, the platform provides services that alarm or warn on anomalous conditions, and several services that close feedback loops with the Distributed Computing systems. Here we briefly describe the main system components and data flows, but concentrate on the hardware and software tools we use for in-depth analytics and simulations and on support for machine learning algorithms, specifically artificial neural network training and reinforcement learning techniques. We describe several applications the platform enables and discuss ways to scale up further.

Primary authors

Ilija Vukotic (University of Chicago (US)), Federica Legger (Ludwig Maximilians Universitat (DE)), Dario Barberis (Università e INFN Genova (IT)), James Catmore (University of Oslo (NO))
