Description
Most current machine learning (ML) applications are purely data-driven solutions with little consideration for the underlying problem dynamics, which limits them to in-distribution settings. To tackle this limitation, a stream of literature is emerging that addresses out-of-distribution (OOD) performance: algorithmic alignment, which focuses on embedding algorithmic structures into ML architectures so that they reflect the inherent logic of the problem at hand. The general idea can be summarized in two steps: first, we formalize the dynamics and mathematical machinery involved, as well as the constraints and assumptions on data, outputs and parameters; second, we design the ML algorithm that maximally replicates this specification, i.e., we implement its inductive bias (IB).
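To make step two concrete, consider a toy specification stating that the target is a permutation-invariant, additive function of a set of inputs; an aligned architecture then hard-codes sum aggregation, in the spirit of Deep Sets. The following minimal sketch (all names and parameter shapes are illustrative, not from any specific library) shows this IB implemented directly rather than learned from data:

    import numpy as np

    # Hypothetical sketch: the specification says the target is a
    # permutation-invariant, additive function of a set of inputs, so the
    # architecture hard-codes that inductive bias via sum aggregation.

    def encode(x, W):
        # Elementwise encoder applied to each set member independently.
        return np.tanh(W @ x)

    def aligned_model(inputs, W, V):
        # Sum aggregation makes the model invariant to input ordering by
        # construction: the IB is implemented, not learned.
        pooled = sum(encode(x, W) for x in inputs)
        return V @ pooled

    rng = np.random.default_rng(0)
    W, V = rng.normal(size=(4, 3)), rng.normal(size=(1, 4))
    xs = [rng.normal(size=3) for _ in range(5)]
    # Reordering the input set leaves the output unchanged.
    assert np.allclose(aligned_model(xs, W, V), aligned_model(xs[::-1], W, V))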
The relatively recent literature on algorithmic alignment, however, lacks a proper characterization of existing algorithms and IBs.
We provide this characterization for our core research focus: the acceleration of large-scale scientific simulators. We hypothesize that such simulators already embed domain knowledge mathematically, as the result of thoroughly studied physical phenomena and decades of development of dedicated algorithms. Examples include sequential, discrete-event simulators, such as traffic simulators, and physically characterized ones, such as those from hydrodynamics or climate science. The approach is designed to transfer to other scientific disciplines and to facilitate the application of algorithmic reasoning in ML solutions. We analyze three main subjects: traditional inductive biases in ML and how they align with simulators; unconventional inductive biases inspired by the domain knowledge and generalizing power of simulators; and algorithmic structures from the most common algorithms in large-scale simulators.
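For concreteness, the sequential, discrete-event pattern we refer to reduces to an event loop over a time-ordered queue; a minimal, domain-agnostic sketch follows, in which the event names and the follow-up rule are purely illustrative:

    import heapq

    # Minimal discrete-event loop: events are processed in timestamp order
    # and may schedule future events, the structure shared by e.g. traffic
    # simulators.
    def run(initial_events, horizon):
        queue = list(initial_events)          # (time, event_id) pairs
        heapq.heapify(queue)
        log = []
        while queue:
            t, event = heapq.heappop(queue)
            if t > horizon:
                break
            log.append((t, event))
            # Illustrative handler: each arrival spawns a departure later.
            if event == "vehicle_enters":
                heapq.heappush(queue, (t + 1.0, "vehicle_exits"))
        return log

    print(run([(0.0, "vehicle_enters")], horizon=5.0))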
Our analysis of these multidisciplinary perspectives will result in a dictionary of IBs and their connections to specific tasks, which shall guide researchers and practitioners towards more robust ML solutions. We have already characterized simulators in the traffic domain and identified characteristic features of algorithms (such as Dijkstra and Frank-Wolfe) and models (such as the Cell Transmission Model or Elastic Traffic Assignment) used in simulators. Some relevant trends emerge. We find, for example, that most simulators lean on addition and short-term memory rather than multiplication or long-term memory. The complexity of the simulation at hand is also a strong indicator of modularity, loop invariance and a smoothness bias. Most simulators appear to deal with structural sparsity through computational power alone, while only a few more targeted models actually avoid it. Most simulators can also be separated into multiple components with different IBs and algorithmic structures, and understanding these is critical for designing ML meta-models able to perform OOD as well as current simulators do.
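As one concrete instance of these trends, consider a simplified Cell Transmission Model step (our own minimal sketch following Daganzo's formulation, with illustrative parameter values): the update is purely local and additive, and each cell's next density depends only on the previous step, i.e., short-term memory and addition rather than multiplication.

    import numpy as np

    # Simplified Cell Transmission Model step (sketch after Daganzo, 1994):
    # densities are updated additively from fluxes between neighboring cells.
    def ctm_step(rho, v=1.0, w=0.5, rho_jam=1.0, q_max=0.25, dt_dx=0.5):
        demand = np.minimum(v * rho, q_max)              # what a cell can send
        supply = np.minimum(w * (rho_jam - rho), q_max)  # what a cell can receive
        flux = np.minimum(demand[:-1], supply[1:])       # interior interfaces
        new_rho = rho.copy()
        new_rho[:-1] -= dt_dx * flux    # outflow from upstream cells
        new_rho[1:] += dt_dx * flux     # inflow to downstream cells
        return new_rho

    rho = np.array([0.8, 0.2, 0.0, 0.0])   # initial density per cell
    for _ in range(3):
        rho = ctm_step(rho)
    print(rho)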
The study is designed as a toolbox for building novel, efficient architectures that embed specific generalization preferences mirroring the identified IBs. The analysis of common algorithmic structures in simulators will make it easier to design ML models algorithmically aligned with the problem at hand, isolating nonlinearities and improving the expressiveness of the novel architectures. The findings shall transfer readily to other scientific domains whose large bodies of domain knowledge are already embedded in models, algorithms and simulations. A practical example may be presented.
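As a hedged illustration of what such an aligned model could look like (a sketch under our assumptions, not a finished architecture): shortest-path routines such as Dijkstra and Bellman-Ford rely on min-plus relaxations, so an aligned message-passing layer replaces the usual sum aggregation with a minimum. With identity message and update functions, one layer reproduces exactly one relaxation step:

    import numpy as np

    # Sketch: a message-passing step aligned with Bellman-Ford relaxation.
    # With identity message/update functions it reproduces the algorithm
    # exactly; learned functions would generalize it while keeping the
    # min-aggregation inductive bias.
    def aligned_mp_step(dist, edges):
        new_dist = dist.copy()
        for u, v, w in edges:                           # (source, target, weight)
            new_dist[v] = min(new_dist[v], dist[u] + w) # min-plus aggregation
        return new_dist

    edges = [(0, 1, 2.0), (1, 2, 1.0), (0, 2, 5.0)]
    dist = np.array([0.0, np.inf, np.inf])              # distances from node 0
    for _ in range(2):                                  # |V| - 1 relaxation rounds
        dist = aligned_mp_step(dist, edges)
    print(dist)   # [0.0, 2.0, 3.0]

Swapping the hand-written minimum and addition for learned functions, while retaining the min-style aggregation, is precisely the kind of design choice the proposed dictionary of IBs is meant to guide.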