Feb 11 – 14, 2008
<a href="http://www.polydome.org">Le Polydôme</a>, Clermont-Ferrand, FRANCE
Europe/Zurich timezone

Testing 65536 parallel pseudo-random number streams

Feb 12, 2008, 4:00 PM
Exhibition Hall (<a href="http://www.polydome.org">Le Polydôme</a>, Clermont-Ferrand, FRANCE)

Poster Scientific Results Obtained Using Grid Technology Posters

Speaker

Mr Romain Reuillon (LIMOS)

Description

Monte Carlo simulations are typical Grid applications. They are considered naturally parallel because many replications of the same experiment can be distributed over multiple execution units to reduce the overall simulation time. However, care must be taken with the underlying random number streams to ensure that the generated streams show neither intra-stream nor inter-stream correlations. TestU01 is a well-known, stringent battery of sequential statistical tests that aims to detect defects in pseudo-random number sequences. Matsumoto designed a parallel version of the widely used, high-quality pseudo-random generation algorithm called Mersenne Twister. Using a parameterization technique, we have generated independent parallel Mersenne Twisters, which must be tested for statistical deficiencies with TestU01. The best generators can then be safely used in parallel for nuclear medicine Monte Carlo simulations.
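To make the idea of testing a stream for statistical deficiencies concrete, here is a minimal sketch of a single chi-square uniformity check. It is not TestU01 and not the parameterization technique: the streams are Python's built-in Mersenne Twister (`random.Random`) with different seeds, used only as a hypothetical stand-in, and the names `chi_square_uniformity` and `ConstantGenerator` are invented for illustration.

```python
import random


def chi_square_uniformity(rng, n_samples=100_000, n_bins=16):
    """Chi-square statistic measuring how evenly rng.random() fills n_bins equal bins."""
    counts = [0] * n_bins
    for _ in range(n_samples):
        # rng.random() is in [0, 1), so the bin index is in 0..n_bins-1
        counts[int(rng.random() * n_bins)] += 1
    expected = n_samples / n_bins
    return sum((c - expected) ** 2 / expected for c in counts)


class ConstantGenerator:
    """Deliberately broken generator (illustration only): always returns 0.5."""

    def random(self):
        return 0.5


# Assumption: differently seeded MT19937 instances stand in for the
# parameterized generators described above; they are NOT the same thing.
streams = [random.Random(seed) for seed in range(4)]
good_stats = [chi_square_uniformity(rng) for rng in streams]
bad_stat = chi_square_uniformity(ConstantGenerator())

for seed, stat in enumerate(good_stats):
    print(f"stream {seed}: chi-square = {stat:.2f}")
print(f"constant generator: chi-square = {bad_stat:.2f}")
```

With 16 bins the statistic has 15 degrees of freedom, so a healthy stream should give a value of roughly that order, while the constant generator gives a value many orders of magnitude larger. A real battery such as TestU01 applies many different and far more demanding tests to each stream.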

4. Conclusions / Future plans

We have tested 65536 independent pseudo-random number streams. To carry out this work, we installed our test battery on 54 tagged computing elements (CE – VO BIOMED). Job scheduling was handled entirely by resource brokers. The output text files, each around 100 KB, were collected using the output sandboxes.
The next step is to test cross-correlations between the different pseudo-random number streams. The amount of work grows exponentially with the number of streams.
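As a rough illustration of how quickly cross-correlation testing grows, the sketch below counts the combinations of streams that could be tested. Which combinations would actually be tested is not stated here, so both cases are assumptions: testing every pair of streams, and testing every group of two or more streams. The function names are hypothetical.

```python
from math import comb


def pairwise_tests(n_streams):
    """Number of distinct pairs of streams: n * (n - 1) / 2."""
    return comb(n_streams, 2)


def all_group_tests(n_streams):
    """Number of groups of two or more streams: 2^n - n - 1 (exponential in n)."""
    return 2 ** n_streams - n_streams - 1


n = 65536
print(pairwise_tests(n))  # 2147450880 pairs for 65536 streams

# The count of larger groups explodes even for a handful of streams.
for small_n in (4, 10, 20):
    print(small_n, pairwise_tests(small_n), all_group_tests(small_n))
```

Even the pairwise case alone gives more than two billion tests for 65536 streams, compared with 65536 tests when each stream is checked on its own.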

Keywords

Statistical tests, parallel pseudo-random numbers, Mersenne Twister, parameterization, Monte Carlo.

1. Short overview

Some Monte Carlo simulations execute many independent replications to converge. They should therefore be considered killer applications for the grid infrastructure. However, distributing stochastic simulations requires many independent high-quality pseudo-random number streams. We have run a statistical test battery on the EGEE grid to test 65536 streams generated by a recent parallel pseudo-random generator: the parametric Mersenne Twister.

3. Impact

We have generated 2^16 parameters for the Mersenne Twister parameterization algorithm. This yields 65536 different Mersenne Twisters, which have to be tested separately; the full test battery can take more than 24 hours on current processors. To dispatch this computing load, we used the DistMe software framework to generate jobs for the Ganga runtime management software package. Each job tests one of the generators. We ran this huge set of jobs in separate Ganga instances to speed up job submission. Each job ran for 8 hours (for a total of 60 CPU years); we could not have completed this kind of task without a computing grid. Such independent random streams are crucial in parallel Monte Carlo simulations for nuclear medicine.
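The 60 CPU-year figure can be checked with simple arithmetic from the numbers given above (2^16 jobs, 8 hours each). The sketch below assumes a 365-day year; the constant names are chosen for illustration.

```python
N_GENERATORS = 2 ** 16        # one job per parameterized Mersenne Twister
HOURS_PER_JOB = 8             # run time per job, as reported above
HOURS_PER_YEAR = 24 * 365     # assumption: 365-day year

total_cpu_hours = N_GENERATORS * HOURS_PER_JOB
total_cpu_years = total_cpu_hours / HOURS_PER_YEAR

print(f"{N_GENERATORS} jobs x {HOURS_PER_JOB} h = {total_cpu_hours} CPU hours")
print(f"about {total_cpu_years:.1f} CPU years")
```

This gives 524288 CPU hours, or about 59.9 CPU years, which matches the rounded total of 60 CPU years.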

Primary author

Co-author

Prof. David Hill (LIMOS)
