ROOT Parallelisation, Performance and Programming Model Meeting
Present: Enric, Enrico, Guilherme, Xavi, Philippe, Brian, Danilo
Actions
Xavi: redo the plots with proper scaling
Guilherme: add specific compiler options to remove warnings in directory core/imt. We need a general solution for all externals.
Danilo: get from Dan the workload that exhibited the CPU consumption and fix the problem, if possible.
Enric
- New TCsvDS merged to ROOT. Also with tutorials in Python and C++
- Now working on friend trees and friend chains. TTreeProcessorMT needs to be reworked: when a chain has friend chains, decomposing the chains into individual trees is no longer valid.
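One plausible reading of the friend-chain problem can be sketched in plain C++, without ROOT. The `Location` struct, the `Locate` helper and the entry counts in the note below are hypothetical and only illustrate how file boundaries in a main chain and in a friend chain may fail to line up; this is not the TTreeProcessorMT implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type: position of an entry inside a chain of files.
struct Location {
    std::size_t file;      // index of the file (tree) within the chain
    long long localEntry;  // entry number relative to the start of that file
};

// Map a global entry number of a chain to (file index, local entry),
// given the number of entries stored in each file of the chain.
// If globalEntry is past the end of the chain, file equals entriesPerFile.size().
Location Locate(const std::vector<long long>& entriesPerFile, long long globalEntry) {
    std::size_t file = 0;
    while (file < entriesPerFile.size() && globalEntry >= entriesPerFile[file]) {
        globalEntry -= entriesPerFile[file];
        ++file;
    }
    return {file, globalEntry};
}
```

With an assumed main chain of two files holding 100 and 150 entries and a friend chain holding 150 and 100 entries, global entry 100 is local entry 0 of the second main tree but local entry 100 of the first friend tree. A task that processes the second main tree on its own and reads its friend starting from local entry 0 would therefore pair the wrong entries.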
Enrico
- Min and Max changed in behaviour: they always returned doubles, but they should return the right type.
- Detection of wrong Filter/Define function signatures is now SFINAE-friendly.
- Received feature requests: 100 cuts and 100 histos to then compare the effect. Doing 100 filters consumes a lot of run time and memory. The jitting at O2 level of all the filters takes 8 minutes while a compiler manages to do that in seconds.
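The SFINAE-friendly signature detection mentioned above can be illustrated with a minimal, self-contained sketch. The names `TypeList`, `IsCallableWith`, `GoodCut`, `BadCut` and `DescribeFilter` are hypothetical and are not taken from the ROOT code base; the sketch only shows the general technique of testing whether a callable accepts given argument types without triggering a hard compilation error.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical helper: a plain list of types.
template <typename... Ts>
struct TypeList {};

// Primary template: by default F is not callable with the listed types.
template <typename F, typename Args, typename = void>
struct IsCallableWith : std::false_type {};

// Specialisation selected only if the call expression F(Args...) is well-formed.
// If it is not, substitution fails silently and the primary template is used.
template <typename F, typename... Args>
struct IsCallableWith<F, TypeList<Args...>,
                      decltype(void(std::declval<F>()(std::declval<Args>()...)))>
    : std::true_type {};

// A cut that takes the column type (double) and one that does not.
struct GoodCut { bool operator()(double x) const { return x > 0.; } };
struct BadCut  { bool operator()(const char* s) const { return s != nullptr; } };

static_assert(IsCallableWith<GoodCut, TypeList<double>>::value, "GoodCut accepts a double");
static_assert(!IsCallableWith<BadCut, TypeList<double>>::value, "BadCut does not accept a double");

// Hypothetical stand-in for a Filter-like entry point acting on a double column:
// it reports a mismatch instead of failing deep inside template code.
template <typename F>
const char* DescribeFilter(const F&) {
    return IsCallableWith<F, TypeList<double>>::value ? "ok" : "signature mismatch";
}
```

Because the check is a compile-time constant, it could equally feed a `static_assert` with a readable message at the point where the user passes the function.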
Guilherme
- The papers' authors have been informed. Drafts uploaded. Still time to integrate comments.
Philippe
- Issue with TBB headers: marked as system header to remove warnings.
- Guilherme will add specific compiler options to remove warnings in directory core/imt. We need a general solution for all externals.
- Dan found the merging core still consuming 15% of a core.
Danilo
- The paper's authors have been informed. Drafts uploaded. Still time to integrate comments.
JIT Slowness
This program was taking 27 seconds to execute on a debug build (opt llvm)
int enrico() {
   // build a TDF with 1 event and 1 column "x" that is always equal to 42
   TDataFrame dd(1);
   auto d = dd.Define("x", []() { return 42; });

   // book nHistos histograms,
   // all with the same cut and filled with the same variable in this simple example
   std::vector<TDF::TResultProxy<TH1D>> histos;
   const auto nHistos = 1000u;
   histos.reserve(nHistos);
   for (auto i = 0u; i < nHistos; ++i)
      histos.emplace_back(d.Histo1D("x"));

   // run the event loop, print something to be sure everything is ok
   std::cout << histos.front()->GetMean() << std::endl;
   return 0;
}
After reverting (by hand) https://github.com/root-project/root/commit/548eca7, the same program takes 4 seconds.
Long Running tests
Optimised.