Speaker
Description
Summary
Non-intrusiveness is one of the primary goals of the project,
in order to ease the effort of introducing multi-threading to the
experiments' large existing code bases.
The concurrent components therefore present an option to the user and do not interfere with
existing sequential production code.
This allows for an incremental revision of the existing implementation and adoption of the components offered.
In order to process data concurrently the user must
- explicitly declare data in- and outputs to each algorithm,
- declare which reusable modules -- called Tools in the Gaudi context -- are used by an algorithm, and
- revise thread-unsafe code such as use of caches, back-channel communication,
race conditions in the update of shared data, ...
The first two points can be achieved by using the newly introduced data and tool handles, respectively.
To ease their use, the interfaces of the handles were designed according to components familiar
to Gaudi developers.
The third point requires a case by case analysis of the operations performed by the algorithm.
Having explicitly stated the algorithms' in- and outputs allows the automated
deduction of the data flow among them.
In Gaudi, the execution path is not only influenced by data dependencies but also by control flow constructs.
These are expressed by grouping algorithms in sequences.
Each algorithm produces a binary decision, which causes the further execution or abortion of its sequence.
By composing several sequences, a complex control flow can be modeled.
Including both data dependencies and control flow constructs in a unified graph,
gives a coherent picture of the relationships between algorithms.
The information required to construct the control flow portion of the graph can be extracted from
the sequence hierarchy automatically and requires no further user intervention.
The unified data- and control-flow graph is a precedence-constrained directed acyclic graph.
The scheduler identifies algorithms with fulfilled data dependencies in the graph,
which are candidates for execution.
An algorithm's execution priority is determined by its connectedness and membership to the critical path;
it can become zero if it is no longer required to execute by the control flow.
Candidates with non-zero priority are submitted by the scheduler for execution in the order of their priorities.
The graph structure also serves a secondary purpose:
it allows the analysis of a given configuration statically for
unfulfillable data dependencies and superfluous control flow constructs.
Furthermore, the length of the critical path and the maximum number of concurrently executable algorithms
can be determined to aid in resource planning.
The gradual migration strategy and the benefits of static configuration analysis
present a strong argument for the adoption of the components and concepts developed in the
concurrent Gaudi project.
The adoption presents an opportunity for general code improvement as well as the possibility
to explore the use of accelerators and co-processors in the future.