2–6 Mar 2009
Le Ciminiere, Catania, Sicily, Italy
Europe/Rome timezone

Toward Responsive Grids through Multi Objective Reinforcement Learning

2 Mar 2009, 18:10
20m
Machiavelli (40) (Le Ciminiere, Catania, Sicily, Italy)

Type: Oral
Track: Scientific results obtained using grid technology (Grid Research)

Speaker

Julien Perez (LRI, CNRS and Université Paris-Sud)

Description

EGEE has experimented with specialized software and configurations, such as priorities, Virtual Reservations and overlay task management, in order to provide the differentiated Quality of Service (QoS) requested by an increasingly diverse user community. To combine differentiated QoS, fair-share and self-configuration under a unique production model, we propose a multi-objective Reinforcement Learning (RL) approach for site-level dynamic allocation of grid resources, and validate it using EGEE traces.
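
As a rough illustration of how differentiated QoS and fair-share might be combined into a single reward signal, the sketch below scalarizes a per-job utility and a fair-share term. The functional forms, weights and names (qos_utility, fair_share_reward, w_qos, w_fair) are illustrative assumptions, not the formulation used in this work.

    def qos_utility(wait_time, deadline, responsive=True):
        """Parameterized user satisfaction: 1.0 if the job waits no
        longer than its acceptable delay, decaying afterwards
        (assumes deadline > 0)."""
        if wait_time <= deadline:
            return 1.0
        # Hypothetical choice: responsive jobs are penalized more
        # sharply for lateness than batch jobs.
        decay = 2.0 if responsive else 0.5
        return max(0.0, 1.0 - decay * (wait_time - deadline) / deadline)

    def fair_share_reward(achieved, target):
        """1 minus half the L1 distance between achieved and target
        VO shares; lies in [0, 1] when each share vector sums to 1."""
        dist = sum(abs(achieved[vo] - target[vo]) for vo in target)
        return 1.0 - 0.5 * dist

    def scalar_reward(wait_time, deadline, responsive, achieved, target,
                      w_qos=0.7, w_fair=0.3):
        """Scalarize the two objectives with hypothetical weights."""
        return (w_qos * qos_utility(wait_time, deadline, responsive)
                + w_fair * fair_share_reward(achieved, target))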

Keywords

Scheduling, Reinforcement Learning

Impact

We developed a simulation framework for experimentation and validation. The most important design choice is the non-linear continuous approximation of the value function, for which we have explored various neural network architectures. A sparse neural network makes it possible to represent user identity as a categorical (modal) variable; introducing this variable has a significant performance impact.
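
One plausible reading of such an architecture, sketched below, encodes user identity as a one-hot (hence sparse) input block alongside the continuous state features; all sizes and names (N_USERS, HIDDEN, q_value) are assumptions for illustration, not the networks actually evaluated.

    import numpy as np

    rng = np.random.default_rng(0)

    N_USERS = 50   # assumed vocabulary of user identities
    N_CONT = 6     # assumed number of continuous state/action features
    HIDDEN = 32

    # Weights of a small value network; the one-hot user block makes
    # the input sparse, so only one row of W_user is active per example.
    W_user = rng.normal(scale=0.1, size=(N_USERS, HIDDEN))
    W_cont = rng.normal(scale=0.1, size=(N_CONT, HIDDEN))
    b_h = np.zeros(HIDDEN)
    w_out = rng.normal(scale=0.1, size=HIDDEN)
    b_out = 0.0

    def q_value(user_id, cont_features):
        """Approximate value for a state/action encoded as a user
        identity (categorical) plus continuous features."""
        # Indexing W_user by user_id is equivalent to multiplying
        # W_user by the one-hot encoding of the user identity.
        h = np.tanh(W_user[user_id] + cont_features @ W_cont + b_h)
        return float(h @ w_out + b_out)

    print(q_value(3, rng.normal(size=N_CONT)))
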
The experimental validation exploits various segments (typically one week) of a trace recorded at the LAL site, which is equipped with a MAUI/PBS scheduler and has Virtual Reservations enabled. Various performance metrics are examined, namely the distributions of the original utility function, the relative overhead (ratio of waiting time to execution time), the absolute waiting time, and the distance to the optimal fair-share. The RL scheduler consistently outperforms the native scheduler w.r.t. QoS and exhibits similar fair-share performance, but requires a significant training period.
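
For concreteness, the snippet below shows how such metric distributions could be extracted from a trace segment; the record fields and values are assumptions made for illustration, not the actual EGEE/LAL trace schema.

    import statistics

    # Hypothetical per-job trace records (times in seconds).
    jobs = [
        {"wait": 12.0, "exec": 340.0, "utility": 0.98},
        {"wait": 95.0, "exec": 60.0, "utility": 0.41},
        {"wait": 4.0, "exec": 15.0, "utility": 1.00},
    ]

    overheads = [j["wait"] / j["exec"] for j in jobs]  # relative overhead
    waits = [j["wait"] for j in jobs]                  # absolute waiting time
    utilities = [j["utility"] for j in jobs]           # achieved utility

    # Compare schedulers by the distributions of these metrics,
    # e.g. via their medians.
    print(statistics.median(overheads), statistics.median(waits))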

URL for further information

http://www.grid-observatory.org

Detailed analysis

Our application of RL to site scheduling discovers online a policy that maps the site’s states to the decisions the scheduler ought to take in those states, so as to maximize long-term cumulative rewards. Compact state descriptions allow steering the scheduling process through high-level objectives. The requirement for differentiated QoS is expressed through parameterized utility functions associated with responsive and batch jobs; as they describe how “satisfied” the user will be if his/her job finishes after a certain time delay, the parameters have an intuitive interpretation. The fair-share reward controls the compliance of the scheduling process with the shares given to each VO, which are independently defined by the various grid stakeholders, and can cope with under-utilized shares. Using the State-Action-Reward-State-Action (SARSA) algorithm, the RL scheduler can quickly adapt its decisions to the non-stationary distributions of inter-arrival time, load, and QoS requirements featured by EGEE.
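
For reference, a minimal tabular SARSA sketch is shown below; the scheduler described above uses continuous function approximation rather than a table, and the hyper-parameters (ALPHA, GAMMA, EPS) are illustrative assumptions.

    import random
    from collections import defaultdict

    ALPHA, GAMMA, EPS = 0.1, 0.99, 0.1   # assumed hyper-parameters
    Q = defaultdict(float)               # Q[(state, action)] table

    def epsilon_greedy(state, actions):
        """Pick a random action with probability EPS, else the
        current greedy action."""
        if random.random() < EPS:
            return random.choice(actions)
        return max(actions, key=lambda a: Q[(state, a)])

    def sarsa_update(s, a, r, s_next, a_next):
        """On-policy SARSA update:
        Q(s,a) += alpha * (r + gamma * Q(s',a') - Q(s,a))."""
        Q[(s, a)] += ALPHA * (r + GAMMA * Q[(s_next, a_next)] - Q[(s, a)])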

Conclusions and Future Work

Multi-objective RL provides a method to neatly combine heterogeneous goals and discover policies that satisfy them. This work deals with QoS and fair-share, but green-computing objectives could be integrated as well. Ongoing work includes multi-scale reinforcement learning, to handle the differing time scales of the objective functions, and hybrid methods in which offline RL, online RL, and the native scheduler are combined to cope with transitory periods.

Authors

Balazs Kegl (LAL, CNRS and Université Paris-Sud)
Cecile Germain-Renaud (LRI, CNRS and Université Paris-Sud)
Charles Loomis (LAL, CNRS and Université Paris-Sud)
Julien Perez (LRI, CNRS and Université Paris-Sud)

Presentation materials