Hydra

A library for data analysis in multithreaded platforms

A. Augusto Alves Jr ( University of Cincinnati )

CERN Geneve, 31 October 2016

aalvesju@cern.ch

Presented in the DIANA Project Meeting.

Outline

Design, strategies and goals
Core Functionalities
- Functors
- Data containers
- Generic function evaluation
Examples
- Multidimensional numerical integration
- Multidimensional random number generation
- Phase-Space Monte Carlo
- Interface to Minuit2
Summary and prospects

Hydra

Hydra is a header only templated C++ library to perform data analysis on massively parallel platforms.

hydra-fig

It is implemented on top of the C++ Standard Library and a variadic version of the Thrust library.
Hydra runs on Linux systems and can perform calculations using OpenMP, Cuda and TBB enabled devices.
It is focused on portability, usability, performance and precision.

Design

The main design features are:

The library is structured using static polymorphism.
Type safety is enforced at compile time in order to avoid the production of invalid or performance degrading code.
No function pointers or virtual functions at all: stack behavior is known at compile time.
No need to write explicit device code.
Clean and concise semantics. Interfaces are designed to be easy to use correctly and hard to use incorrectly.
Common C++ idioms: container iterators, named parameter semantics, RAII, CRTP…

The same source files written using Hydra components and standard C++ compiles on GPU or CPU just exchanging the extension from .cu to .cpp.

Functionalities currently implemented

Generation of phase-space Monte Carlo Samples with any number of particles in the final states. Weighed and unweighted samples are supported.
Sampling of multidimensional PDFs.
Data fitting of binned and unbinned multidimensional data sets.
Evaluation of multidimensional functions over heterogeneous data sets.
Numerical integration of multidimensional functions using Monte Carlo-based methods: flat or self-adaptive (Vegas-like)

Functors

In Hydra most of the calculations are performed using function objects. Hydra add features, type information and interfaces to generic functors using the CRTP idiom.

//Gaussian with two parameters: mean and sigma
struct Gauss:public BaseFunctor<Gauss,double, 2>
{
  //client need to implement the Evaluate(T* ) method for homogeneous data.
  template<typename T>
  __host__ __device__
  inline double Evaluate(T* x){/* implementation */}

  //or heterogeneous data. In this case <T> represents a variadic tuple.
  template<typename T>
  __host__ __device__
  inline double Evaluate(T x){/* implementation */}
}

For all functors deriving from hydra::BaseFunctor<Func, ReturnType, NParams>, if the calculation is expensive or the functor will be called many times, the results of the first call can be cached.

Arithmetic and composition with functors

Hydra provides a lot “syntax sugar” to deal with function objects.

All the basic arithmetic operators are overloaded. Composition is also possible. If A, B and C are Hydra functors, the code below is completely legal.

 //basic arithmetic operations
 auto plus_functor = A + B;  auto minus_functor = A - B;
 auto multiply_functor = A * B;  auto divide_functor = A/B; 

 //composition of basic operations
 auto any_functor = (A - B)*(A + B)*(A/C);
 
 // C(A,B) is represented by:
 auto composed_functor = compose(C, A, B)

The functors resulting from arithmetic operations and composition can be cached as well.
No intrinsic limit on the number of functors participating on arithmetic or composition mathematical expressions.

Support for C++11 lambdas

Lambda functions are a very precious C++ resources that allow the implementation of new functionalities on-the-fly. These closures can hold state, capture variables defined in the enclosing scope etc…

Lambda are fully supported in Hydra.
In the client code one can define a lambda function at any point and convert it into a Hydra functor using hydra::wrap_lambda():

double two = 2.0;
//define a simple lambda and capture "two" by value
auto my_lambda = [] __host__ __device__(double* x)
    { return two*sin(x[0]); };

//convert is into a Hydra functor
auto my_lamba_wrapped  = wrap_lambda(my_lambda);

CUDA 8.0 supports lambda functions on device and host code.

Data containers

Hydra algorithms can operate over any iterable C++ container. Hydra provides hydra::PointVector.
hydra::PointVector is an iterable collection of hydra::Point objects: hydra::Point represents multidimensional data points with coordinates and value, error of the coordinates and error of the value.
hydra::Point uses conditional base class members and methods injection in order to save memory and stack size.

 //two dimensional data set
 PointVector<device, double, 2> data_d(1e6);

 // fill data_d and get data from device and fill a ROOT 2D histogram
 PointVector<host> data_h(data_d);

 TH2D hist("hist", "my histogram", 100, min, max); 
for(auto point : data_h )
{
   std::cout << point << std::endl; // stream the points to std::out
   hist.Fill(point.GetCoordinate(0), point.GetCoordinate(1));
   }

Generic function evaluation

Functors can be evaluated over large data sets using the template function hydra::Eval.

hydra::Eval will return a vector with the results.
hydra::Range specifies the input ranges

//allocate a vector [nentries] real numbers in the device (note the _d subscript )
RealVector_d angles_d(nentries);
// Generate angles uniformly in the range
// generation will be performed in the device
Generator.Uniform(0 , 2*PI, angles_d.begin(), angles_d.end() );

// A lambda to calculate sin(x)
auto sinL = [] __host__ __device__(double* x){ return sin(x[0]);};
auto sinW = wrap_lambda(sinL);

// A lambda to calculate cos(x)
auto cosL = [] __host__ __device__(double*  x){ return cos(x[0]);};
auto cosW = wrap_lambda(cosL);

// Evaluation
auto functors = thrust::make_tuple( sinW, cosW);
auto range    = make_range( angles_d.begin(), angles_d.end());
auto result   = Eval( functors, range);

Multidimensional PDF sampling

The template class hydra::Random manages the multidimensional function sampling in Hydra.

Generic PDF sampling using accept-reject method.
hydra::Random provides a number of basic 1D distributions: Uniform, Breit-Wigner…
Can be configured via template parameters (policies) to use different random number generators.

Methods take the target's container iterators as input and perform the generation on the host or device backend.

//Random object with current time count as seed.
Random<thrust::random::default_random_engine>
Generator( std::chrono::system_clock::now().time_since_epoch().count() );
//1D host buffer
hydra::mc_host_vector<double>   data_h(nentries);
//uniform
Generator.Uniform(-5.0, 5.0, data_h.begin(), data_h.end());
//gaussian
Generator.Gauss(0.0, 1.0, data_h.begin(), data_h.end());
//exponential
Generator.Exp(1.0, data_h.begin(), data_h.end());
//breit-wigner
Generator.BreitWigner(2.0, 0.2, data_h.begin(), data_h.end());

Multidimensional PDF sampling

//------------------------------
// Two gaussians hit-and-miss
//------------------------------
// Gaussian 1
std::array<double, 2>  means1  ={2.0, 2.0 };
std::array<double, 2>  sigmas1 ={1.5, 0.5 };
Gauss<2> Gaussian1(means1, sigmas1 );

// Gaussian 2
std::array<double, 2>  means2  ={-2.0, -2.0 };
std::array<double, 2>  sigmas2 ={0.5, 1.5 };
Gauss<2> Gaussian2(means2, sigmas2 );

//add the pdfs
auto Gaussians = Gaussian1 + Gaussian2;

//2D range
std::array<double, 2>  min  ={-5.0, -5.0 };
std::array<double, 2>  max  ={ 5.0,  5.0 };

auto data_d = Generator.Sample<device>(Gaussians,min, max, ntrials );
auto data_h = Generator.Sample<host>(Gaussians,min, max, ntrials );

Multidimensional PDF sampling

HIST_2D_gaussians_DEVICE

HIST_2D_gaussians_HOST

Same algorithm and functor invoked on the device and on the host.
Iterable container is filled on the memory space of the execution backend.
To perform 10 million trials takes:
- GeForce Titan-Z : 50 ms
- Intel i7 4 cores @ 3.60GHz : 600 ms

Phase-Space Monte Carlo

Hydra supports the production of phase-space Monte Carlo samples. The generation is managed by the class hydra::PhaseSpace and the storage is managed by the specialized container hydra::Events.

Policies to configure the underlying random number engine and the backend used for the calculation.
No limitation on the number of final states.
Support the generation of sequential decays.
hydra::Events is iterable and therefore is fully compatible with C++11 iteration over range semantics.
Generation of weighted and unweighted samples.

The Hydra phase-space generator supersedes the library MCBooster, from the same developer.

Phase-Space Monte Carlo

// B0 -> J/psi K pi
Vector4R B0(5.27961, 0.0, 0.0, 0.0);
std::vector<double> 
massesB0{3.096916, 0.493677, 0.13957018 };

// PhaseSpace object for B0-> K pi J/psi
PhaseSpace<3> phsp(B0.mass(), massesB0);
// container
Events<3, device> Evts_d(10e7);
// generate...
phsp.Generate(B0, Evts_d.begin(), Evts_d.end() );

//copy events to the host
Events<3, host> Evts_h(Evts_d);

for(auto event: Evts_h)
{ /*fill histogram*/}

PHSP

Time to generate 10M events:

GeForce Titan-Z: 20 ms
Intel i7 4 cores @ 3.60 GHz: 413 ms

Multidimensional numerical integration

Hydra provides two MC based methods for multidimensional numerical integration: “Plain Monte Carlo” and self-adaptive Vegas-like algorithm (importance sampling).

Hydra implementations follow closely the corresponding GSL algorithms.
Methods can be configured via template parameters (policies) to call the integrand on the host or on the device and use different random number engines.
Both methods use RAII to acquire, initialize and release the resources.
Example of Vegas usage:

//Vegas-state hold the resources for performing the integration
VegasState<1> state =  VegasState<1>( min, max);
// tune the algorithm
state->SetAlpha(1.75);
state->SetIterations(5);
state->SetMaxError(1e-3);
//10,000 maximum function calls in maximum 5 iterations
Vegas<1> vegas( state, 10000);

Interface to Minuit2 and fitting

Hydra implements an interface to Minuit2 that parallelizes the FCN calculation. This accelerates dramatically the calculation over large datasets.

The fit parameters are represented by the class hydra::Parameter and managed by the class hydra::UserParameters.
hydra::UserParameters has the same semantics of Minuit2::MnUserParameters.
Any positive definite Hydra-functor can be converted into PDF.
The PDFs are normalized on-the-fly.
The estimator is a policy in the FCN.
Data is passed via iterators. Any iterable container can be used. The developer advises to use hydra::PointVector.

The FCN provided by Hydra can be used directly in Minuit2.

Interface to Minuit2 and data fitting

Parameter definition syntax is very intuitive:

// 1) using the Named Parameter Idiom
Parameter  mean  = Parameter::Create().Name("Mean").Value(3.0).Error(0.001).Limits( 1.0, 4.0);
Parameter  sigma = Parameter::Create().Name("Sigma").Value(0.5).Error(0.001).Limits(0.1, 1.5);

// 2) using unnamed parameter idiom
Parameter mean("Mean", 3.0, 0.001, 1.0, 4.0) ;
Parameter sigma("Sigma", 0.5, 0.001, 0.1, 1.5) ;

Pass the parameters to Hydra and Minuit2:

hydra::UserParameters upar;
upar.AddParameter(&mean);
upar.AddParameter(&sigma);

//Printing the registered parameters
upar.PrintParameters();

Interface to Minuit2 and data fitting

Defining the functor:

struct Gauss:public BaseFunctor<Gauss,double, 2>
{
Gauss(Parameter const& mean, Parameter const& sigma, GUInt_t position=0 ):
  BaseFunctor<Gauss,double,2>(),
  fPosition(position),
  fM(mean),
  fS(sigma)
  { RegistryParameters({&fM, &fS}); /* registry the parameters here */ }

// Note: Copy ctor and operator= definitions are mmited for brevity

  template<typename T>
    __host__ __device__    inline double Evaluate(T* x)
    {    return exp(-((x[fPosition] - fM) * (x[fPosition] - fM)) / (2 * fS * fS));}

    GUInt_t  fPosition;
    Parameter fM;
    Parameter fS;
};
// Associate the functor to a integrator and make a pdf.
auto Gauss_PDF   = make_pdf(Gauss(mean, sigma), integrator);

Interface to Minuit2 and data fitting

Complete example:

Model = Gaussian1 + Gaussian2 + Exponential
Unbinned maximum likelihood extended fit
Data: 10 Million events.

//Allocate memory to store data 
PointVector<device, double, 1> data_d(nentries);

//Fill data container...

//Get the FCN
auto modelFCN = make_loglikehood_fcn(model, data_d.begin(), data_d.end() );
//Strategy
MnStrategy strategy(1);
//Create the Migrad minimizer
MnMigrad migrad(modelFCN, upar.GetState(),  strategy);
//Perform the fit
FunctionMinimum minimum = migrad();

Interface to Minuit2 and data fitting

Fit

Black dots: Data points
Red line: Fit result
Blue shadowed area: events sampled from fit result.

Time spent performing the fit:

GeForce Titan-Z: 1.6 s
Intel i7 4 cores @ 3.60 GHz: 20 s

Testing

OpenMP backends (Virtual Machines):

Fedora 23: kernel-4.7.7 and gcc-5.3.1
Fedora 24: kernel-4.7.7 and gcc-6.2.1
Linux Mint 18 “Sarah”: kernel-4.4 and gcc-5.4.0
CERN Centos 7: kernel-3.10.0 and gcc-4.8.5

CUDA

Fedora 23: kernel-4.7.7, gcc-5.3.1, cuda-8.0, arch Kepler (Titan Z)
Fedora 23: kernel-4.7.7, gcc-5.3.1, cuda-8.0, arch Quadro (K2200)
Fedora 23: kernel-4.7.7, gcc-5.3.1, cuda-8.0, arch Maxwell (GTX970)
Fedora 23: kernel-4.7.7, gcc-5.3.1, cuda-8.0, arch Maxwell (GTX980M)
SUSE: kernel-4.8.0(?) and gcc-5.4.0, cuda-8.0, arch Pascal (GTX1060)

Summary

Hydra development is supported by NSF. The project is hosted on GitHub:
- Official: https://github.com/MultithreadCorner/Hydra
- Doxygen: https://multithreadcorner.github.io/Hydra/
- Development fork: https://github.com/AAAlvesJr/Hydra
The package includes a suite of examples.
The next version will expand the range of options for data fitting, and implement convolution and histograming-related functionalities.
Some algorithms are being re-optimized as well.

Please, visit the page of the project, give a try, report bugs, make suggestions and if you like it give a star. Thanks !