Op Int Meeting

Europe/Zurich
zoom


https://cern.zoom.us/j/91088780054
  • This meeting is dedicated to the FTS log analysis.
  • Next meeting (Nov 2) will be dedicated to Grafana annotations
  • IML workshop this week
  • Potentially interesting dissertation on Learning Structure for Computer Systems Management

https://uchicago.zoom.us/j/4244311930?pwd=NVFEdjRsV1ZaT25sSGh6bjdpQkpxdz09

Password: 127022

      Department of Computer Science/The University of Chicago

                    *** Dissertation Defense ***

Candidate:  Yi Ding
Date:  Tuesday, November 3, 2020
Time:  10:00 AM
Place:  via zoom
Title: Learning Structure for Computer Systems Management

Abstract:
Modern computer systems expose diverse configurable parameters whose complicated interactions have surprising effects on performance and energy. This puts a great burden on systems designers and researchers to manage such complexity. Machine learning (ML) creates an opportunity to alleviate this burden by modeling resources' complicated, non-linear interactions and delivering an optimal solution to scheduling and resource management problems. However, naively applying traditional ML methods, such as deep learning, creates several challenges including generalization, robustness, and interpretability. A lack of generalizability and robustness in the ML models is largely due to the scarcity and bias of the training data. Causal inference creates an opportunity to tackle these challenges by analyzing observational data rather than data generated from randomized experiments. Since causal inference inherently studies the causal relationships (the underlying structure) rather than correlation between features, it also provides interpretable systems results.
This dissertation presents the algorithms and systems we developed to improve systems outcomes by applying ML along with key techniques from causal inference. First, we describe learning for systems optimization with scarce data and system structure. In particular, we propose a novel generative model to address the data scarcity issue and a multi-phase sampling approach that exploits system structure. Our results show that, after achieving a certain level of accuracy, it is no longer profitable for systems researchers to improve learning systems without accounting for the structure. Thus we advocate that future work on learning for systems should de-emphasize accuracy and instead incorporate the system problem's structure into the learner. Second, we describe Sherlock, a causal straggler prediction framework for datacenters. Stragglers are rare events that exhibit extreme tail latencies, which lead to imbalance in the training data. To address the data imbalance issue, Sherlock augments correlation-based learning with causal analysis without prior knowledge. To effectively mitigate stragglers, Sherlock applies permutation feature importance (PFI) to gain insights into the straggling behavior for further system intervention. Sherlock's combination of PS and PFI allows it to make accurate, interpretable predictions from imbalanced training data.
This work is evidence that causal analysis is effective in delivering more generalizable, robust, and interpretable systems.

Yi's advisor is Prof. Henry Hoffmann
