RooFit is the statistical modeling and fitting package used in many experiments to extract physical parameters from reduced particle collision data. RooFit aims to separate particle physics model building and fitting (the users' goals) from their technical implementation and optimization in the back-end. In this talk, we outline our efforts to further optimize the back-end by automatically running major parts of user models in parallel on multi-core machines.
A major challenge is that RooFit allows users to define many different types of models, with different types of computational bottlenecks. Our automatic parallelization framework must then be flexible, while still reducing run-time by at least an order of magnitude, preferably more.
We have performed extensive benchmarks and identified at least three bottlenecks that will benefit from parallelization. To tackle these and possible future bottlenecks, we designed a parallelization layer that allows us to parallelize existing classes with minimal effort, but with high performance and retaining as much of the existing class's interface as possible.
The high-level parallelization model is a task-stealing approach. Our multi-process approach uses ZeroMQ socket-based communication. Preliminary results show speed-ups of factor 2 to 20, depending on the exact model and parallelization strategy.
We will integrate our parallelization layer into RooFit in such a way that impact on the end-user interface is minimal. This constraint, together with new features introduced in a concurrent RooFit project on vectorization and dataflow redesign, warrants a redesign of the RooFit internal classes for likelihood evaluation and other test statistics. We will briefly outline the implications of this for users.
|Consider for promotion||Yes|