Description
4. Conclusions / Future plans
We have tested 65536 independent pseudo-random number streams. To carry out this work, we installed our test battery on 54 tagged computing elements (CEs) of the BIOMED VO. Job scheduling was handled entirely by the resource brokers. The output text files, each around 100 KB, were collected through the output sandboxes.
The next step is to test cross-correlations between the different pseudo-random number streams. The amount of work grows exponentially with the number of streams.
Keywords
Statistical tests, parallel pseudo-random numbers, Mersenne Twister, parameterization, Monte Carlo.
1. Short overview
Some Monte Carlo simulations need many independent replications to converge. They should therefore be considered killer applications for the grid infrastructure. However, distributing stochastic simulations requires many independent, high-quality pseudo-random number streams. We have run a statistical test battery on the EGEE grid to test 65536 streams generated by a recent parallel pseudo-random number generator: the parametric Mersenne Twister.
3. Impact
We have generated 2^16 parameter sets with the Mersenne Twister parameterization algorithm. This yields 65536 different Mersenne Twisters, each of which has to be tested separately, and the full test battery can take more than 24 hours on current processors. To dispatch this computing load, we used the DistMe software framework to generate jobs for the Ganga runtime management software package. Each job tests one of the generators. We ran this huge set of jobs in separate Ganga instances to speed up job submission. Each job ran for 8 hours, for a total of about 60 CPU years; we could not have carried out this task without a computing grid. Such independent random streams are crucial in parallel Monte Carlo simulations for nuclear medicine.
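The figures above can be checked with a short back-of-the-envelope sketch. It is not the DistMe or Ganga code: the function names are hypothetical, the year is assumed to have 365 days, and the number of submission instances (10) is an arbitrary illustrative value, since the actual number is not stated here.

```python
# Back-of-the-envelope check of the computing load, under stated assumptions.
N_STREAMS = 2 ** 16          # one job per Mersenne Twister parameter set
HOURS_PER_JOB = 8            # run time per job, as reported in the text
HOURS_PER_YEAR = 24 * 365    # assumption: 365-day year


def total_cpu_years(n_jobs, hours_per_job):
    """Total CPU time of n_jobs jobs, expressed in years."""
    return n_jobs * hours_per_job / HOURS_PER_YEAR


def split_into_batches(n_jobs, n_batches):
    """Split job indices 0..n_jobs-1 into n_batches contiguous ranges
    whose sizes differ by at most one (hypothetical helper)."""
    base, extra = divmod(n_jobs, n_batches)
    batches = []
    start = 0
    for i in range(n_batches):
        # The first `extra` batches take one additional job each.
        size = base + (1 if i < extra else 0)
        batches.append(range(start, start + size))
        start += size
    return batches


if __name__ == "__main__":
    print(f"{N_STREAMS} jobs, {total_cpu_years(N_STREAMS, HOURS_PER_JOB):.1f} CPU years")
    # Assumption: 10 independent submission instances, chosen for illustration.
    for i, batch in enumerate(split_into_batches(N_STREAMS, 10)):
        print(f"instance {i}: jobs {batch.start}..{batch.stop - 1}")
```

With 65536 jobs of 8 hours each, the total is 524288 CPU hours, just under 60 CPU years. Contiguous index ranges keep each submission instance responsible for a disjoint set of generators, so no generator is tested twice.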