9–13 Jul 2018
Sofia, Bulgaria
Europe/Sofia timezone

Application of Deep Learning on Integrating Prediction, Provenance, and Optimization

10 Jul 2018, 16:00
1h

National Culture Palace, Boulevard "Bulgaria", 1463 NDK, Sofia, Bulgaria
Poster Track 6 – Machine learning and physics analysis Posters

Speakers

Dr Malachi Schram (Pacific Northwest National Laboratory)

Description

We investigated novel approaches that use Deep Learning (DL) to execute workflows efficiently on distributed resources. Specifically, we studied the use of DL for job performance prediction, performance classification, and anomaly detection, with the goal of improving the utilization of computing resources.

  • Performance prediction:
      • Capture the performance of workflows on multiple resources
      • Consider intra-node task assignment

  • Performance classification: prediction of job success/failure
      • Predict at regular intervals whether a job will succeed or fail (site reliability)
      • Long short-term memory (LSTM) neural networks

  • Performance anomaly detection:
      • Example: functions that consume unexpectedly large or small amounts of time
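To make the anomaly-detection example concrete, the sketch below flags function timings that are unexpectedly large or small compared with the rest. It is not the DL method studied here: it swaps in a simple statistical baseline (a robust z-score based on the median and the median absolute deviation) purely for illustration. The function name `flag_time_anomalies`, the threshold of 3.5, and the sample timings are all hypothetical.

```python
from statistics import median


def flag_time_anomalies(durations, threshold=3.5):
    """Return indices of durations that are unusually large or small.

    Illustrative baseline only (not a deep-learning model): a robust
    z-score built from the median and the median absolute deviation (MAD).
    """
    med = median(durations)
    mad = median(abs(d - med) for d in durations)
    if mad == 0:
        # More than half the values equal the median; flag anything that differs.
        return [i for i, d in enumerate(durations) if d != med]
    # 0.6745 rescales the MAD so the score is comparable to a standard
    # z-score when the data are roughly normally distributed.
    return [
        i
        for i, d in enumerate(durations)
        if 0.6745 * abs(d - med) / mad > threshold
    ]


# Hypothetical wall-clock times (seconds) for repeated calls of one function.
timings = [10.1, 9.8, 10.3, 10.0, 9.9, 42.0, 10.2, 0.4]
anomalies = flag_time_anomalies(timings)
print(anomalies)  # [5, 7]: the 42.0 s and 0.4 s calls
```

Using the absolute deviation catches both directions named in the example: the call that took far too long and the call that finished suspiciously fast.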

We used the Belle II distributed computing workflow and modifications to the DIRAC system for these studies.

Primary authors

Dr Malachi Schram (Pacific Northwest National Laboratory)
Dr Nathan Tallent (Pacific Northwest National Laboratory)
Dr Ryan Friese (Pacific Northwest National Laboratory)
Alok Singh (University of California San Diego)
