ROOT I/O Meeting

Europe/Zurich
VIDYO

Attendees:  Brian, Danilo, Pere, Guilherme A, Jim P., Viktor, Zhe Z.

Danilo: About the I/O of VecCore's Vector: tricking the I/O into seeing the vector as an array of double works for storage (tested with an #ifdef in the user class). So the solution works and Xavi has a workaround. Next we need to apply to VecCore's vector the same changes as we did for std::array.
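
As a rough illustration of the layout assumption behind this trick, the C++ sketch below round-trips a SIMD-style vector through a plain array of double. The names `FakeSimdVector`, `ToStorage` and `FromStorage` are hypothetical stand-ins and are not VecCore or ROOT API; the workaround discussed at the meeting used an #ifdef in the user class rather than explicit conversion functions.

```cpp
#include <array>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a SIMD vector type (NOT VecCore's actual class):
// four doubles with vector-register alignment.
struct alignas(32) FakeSimdVector {
    double lanes[4];
};

constexpr std::size_t kLanes = 4;

// The trick only makes sense if the vector is bit-for-bit an array of double.
static_assert(sizeof(FakeSimdVector) == kLanes * sizeof(double),
              "vector type must have the same size as an array of double");

// View the vector as a plain array of double, i.e. the form the I/O stores.
std::array<double, kLanes> ToStorage(const FakeSimdVector &v) {
    std::array<double, kLanes> out{};
    std::memcpy(out.data(), &v, sizeof(v));
    return out;
}

// Rebuild the vector from the stored array of double.
FakeSimdVector FromStorage(const std::array<double, kLanes> &in) {
    FakeSimdVector v;
    std::memcpy(&v, in.data(), sizeof(v));
    return v;
}
```

The `static_assert` states the one condition the approach relies on: the vector type has no padding or extra members beyond its lanes.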

Danilo: Streaming column-wise std::array: I uploaded a couple of parts of the fix. FCC is testing the patch on their code and so far, so good. I expect a patch release next Monday.

Pere: However, there was a failure this morning.

Danilo: Indeed, there was a missing cleanup, which I uploaded a few minutes ago.

Viktor: I have been fixing some issues. I added support for multi-map and fixed a problem in the parsing of template arguments. I still have to deal with vectors of pointers to a base class; I have a solution that needs to be implemented. I am trying to do some benchmarking. I have a few examples but would welcome more.

Philippe: Please look at tutorials/*/h1analysis*.  We use this to demonstrate the look and feel of the various interfaces.

Danilo: I will send you some information.

Jim: I expect that the Java version is likely to be 4 times slower than the C++ version. In part because of caching effects, it will be hard to get a clear number or comparison with Spark's DataFrame.

Philippe: Can you disable caching?

Viktor: If you use RDD there is no caching.

Jim: I have been using the I/O part I described at the meeting. It is working great and I am building on it. I am spending time on the distributed aspect of the project.

Philippe: I had a discussion with Igor and Jin. I recommended that Igor implement a class derived from TTree to allow analyses to use the NoSQL backend.

Jim: Indeed, we have two avenues. One is to upload ROOT data into CouchDB, and that is what Igor is working on. I am doing the inverse, having the client pull the information directly from ROOT files into FemtoCode. We are covering all cases with

Danilo: We are also working on adapting TDataFrame to read not only TTree data but also data from other columnar file formats.

Jim: Using TFile to access the NoSQL database would be extremely difficult due to the concept mismatch.

Philippe: Actually, this is not the approach. Instead, the idea is that rather than getting a TTree from a TFile, you could create a TTreeNoSQL based directly on the database information. See the TTreeSQL examples. There is a strong match between the TTree concepts (clusters of entries, baskets) and the one you described.

Jim: The metadata is custom … CouchDB does not (want to) know anything about the content of the blob …

Zhe: I am implementing parallel (TTreeCache) unzipping using TBB and working on a performance comparison with the older implementation.

Brian: I am worried that the fine-grained locking might be a serialization bottleneck. So even if it is faster, we may still want to improve this.

Danilo:  You might benefit from using the TThreadExecutor.  

Zhe: Thanks.  I indeed already found it and I am using it :)

Zhe: About -fPIC and lz4, I saw that when adding -fPIC some unaligned memory routine.  However when adding it to the official build, the difference.

Danilo: Removing -fPIC seems like a non-starter. ROOT itself will require it.

Philippe: Could you investigate the differences between the ROOT build with -fPIC and the regular build?

Brian: Did you try changing compiler version?

Zhe: I am using gcc 4.8.

Brian: You should consider using Docker containers with gcc 6.

Brian: Do we have official docker containers?

Pere: No, but we have all you need with CVMFS.

Danilo: Then we use Docker containers with CVMFS and it works fine.

Brian: I have been distracted with CHEP papers. I will ask DavidA for a written report.

Danilo: zfp is a library for floating-point compression. Did anybody look into this before?

Philippe:  No, not really.

 

    • 16:00 16:40
      Round Table 40m