ROOT I/O Meeting

Name: ROOT I/O Meeting
Start: 2017-01-20T15:55:00+01:00
End: 2017-01-20T17:20:00+01:00
Location: VIDYO

Friday 20 Jan 2017, 15:55 → 17:20 Europe/Zurich



Viktor: Working on debugging reading CMSSW files from Java. Added the ability to read from HDFS. One issue I have is pointers (when always null) within the mini-AOD format, and the other is empty collections. Also, I am not quite clear on how enums and classes that are mentioned (in pointers or collections) but never stored are handled.

Philippe: For both enums and never-actually-used classes, no information is stored in the file. So when you find an ‘unknown’ name, it is either an enum or a never-used class. Consequently, just assuming it is always an enum is ‘safe’.
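
The fallback Philippe describes can be sketched roughly as follows. This is a minimal illustration in Python, not the code of the Java reader: `ResolvedType`, `resolve_type`, and the `known_classes` set are hypothetical names, and the sketch assumes the reader has already collected the names of all classes described in the file. The fallback is presumably harmless because a never-used class has no stored instances to decode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResolvedType:
    """Hypothetical result of looking up a type name found in a file."""
    name: str
    kind: str  # "class" or "enum"


def resolve_type(name, known_classes):
    """Classify a type name against the classes described in the file.

    known_classes is assumed to hold the names of all classes for which
    the file stores a description.
    """
    if name in known_classes:
        return ResolvedType(name, "class")
    # No description stored: the name is either an enum or a class that
    # was never written, so treating it as an enum is the safe default.
    return ResolvedType(name, "enum")
```

For example, `resolve_type("Color", {"Track"})` is classified as an enum, while `resolve_type("Track", {"Track"})` is classified as a class; the type names here are made up.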

Viktor: I am able to process mini-AOD on a Spark cluster over HDFS.

Viktor: Don’t know yet what the next step is. There is a small set of outstanding issues. In Hadoop, we want to better understand the issue related to data locality and how to best match CPU to disk. Another is support for EOS; CERN’s Java implementation is still too basic. I might help IT to flesh it out. I will have a mini-demo for the February workshop. Another milestone is to review and possibly rewrite/refactor the existing Java ROOT I/O library (in particular to flesh out the write part).

Brian: I got distracted by ice storms. Planning a ‘demo’ of TTreeReaderFast in 2 weeks. The 2nd round of IMT improvements is ready for final review.

Guilherme A.: I am working on the libAfterImage update in ROOT. I updated VecCore to be able to install Vc and UME::SIMD.

Jim: I am working closely with Viktor on the ROOT/Spark project. I have a couple of talks for the upcoming workshop and am working on fleshing out the numbers for these talks.

Pere: Everybody away at the moment. Danilo, Enric and Enrico coming back next week.

Philippe: TMapFile was resurrected. There was a bug reported by CMS: when a class A has a nested class B, B has a nested class C, and the innermost class C has const data members, this causes a segmentation fault in TBranchElement::Unroll … I am also working on finalizing the change to ROOT’s libNew and on the thread-safe stack pull request.

Zhe: I have been working on a comparison of LZ4 and zlib using hand-crafted events, ranging from around 6 floating-point values, to 4K events, to 20MB events. I ran 3 tests, using 3 branches with small, medium, and large events. For medium, zlib is 20% faster. For large, zlib is slightly better. For tiny, LZ4 is almost 3 times faster than zlib.

Philippe: Did you control for the resulting compression ratio?

Zhe: Not yet.

Brian: I think we need to write this up in the end to document our decision.

Philippe: We should record the test so that it can be rerun with other compression algorithms.
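
One way such a recorded test could look, as a rough Python sketch rather than Zhe's actual setup: a registry of codecs is run over the same hand-crafted payloads, and the compression ratio is reported next to the timing so the two can be compared together. Everything here is an assumption for illustration: the codecs are Python standard-library stand-ins (LZ4 is not in the standard library and would be registered in `CODECS` the same way), `make_event` is a hypothetical generator, and the "large" size is scaled down from 20MB so the script runs quickly.

```python
import bz2
import lzma
import random
import struct
import time
import zlib

# Registry of (compress, decompress) pairs. Adding another algorithm
# means adding one entry here; the rest of the test is unchanged.
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}

# Number of 4-byte floats per event (illustrative sizes only).
SIZES = {"small": 6, "medium": 1024, "large": 256 * 1024}


def make_event(n_floats, seed=0):
    """Build a reproducible payload of n_floats packed 32-bit floats."""
    rng = random.Random(seed)
    values = [round(rng.gauss(0.0, 1.0), 2) for _ in range(n_floats)]
    return struct.pack(f"<{n_floats}f", *values)


def benchmark(payload, codecs=CODECS, repeat=5):
    """Return best compression time and compression ratio per codec."""
    results = {}
    for name, (compress, decompress) in codecs.items():
        best = float("inf")
        for _ in range(repeat):
            start = time.perf_counter()
            blob = compress(payload)
            best = min(best, time.perf_counter() - start)
        # Check the round trip so a broken codec cannot look fast.
        assert decompress(blob) == payload
        results[name] = {
            "seconds": best,
            "ratio": len(payload) / len(blob),
        }
    return results


if __name__ == "__main__":
    for label, n_floats in SIZES.items():
        for name, r in benchmark(make_event(n_floats)).items():
            print(f"{label:6s} {name:5s} "
                  f"{r['seconds'] * 1e3:9.3f} ms  ratio {r['ratio']:.2f}")
```

Reporting the ratio alongside the time addresses the question of whether a faster codec is simply compressing less.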

Philippe: Tentative agenda for the February 8th workshop can be found at https://indico.fnal.gov/conferenceDisplay.py?confId=13665


- 16:00 → 16:40
  
  Round Table 40m
  