ROOT I/O Meeting

Europe/Zurich
32/1-A24 (CERN)

Brian Paul Bockelman (University of Nebraska-Lincoln (US)), Philippe Canal (Fermi National Accelerator Lab. (US))

Brian: The way CMS does prefetching will suffer from an issue similar to the one ATLAS saw.  If we have multiple threads processing events, a thread can seek backward by a couple of events as far as ROOT is concerned.  So a thread may request data from the ‘previous’ cluster; this leads, in the best case, to an uncached read, or it might thrash the cache.

Peter: See https://github.com/root-project/root/pull/1065.  But at the moment we don’t do as much multi-threading as CMS.
ATLAS had two different problems.  The first is the case of the trailing baskets, where one branch has a small last basket in the cluster; some branches might then already be in the next cluster while that branch is 2 baskets behind.
In addition, we had the issue that our clusters often had 2 baskets per branch.

So the settings are:
    MaxVirtualSize, to express how many clusters to keep in the cache.
    SetClusterPrefetch, to have the cache fetch more baskets early.

Peter: What is good to monitor is the number of bytes read, as the ‘bad scenario’ can lead to either non-vector reads *or* redundant vector reads.
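
To illustrate why bytes read is a useful metric here, below is a toy model (not ROOT code, and not how TTreeCache is actually implemented) that counts the bytes fetched when two threads interleave reads around a cluster boundary. The function name, the LRU eviction policy, the cluster size and the access pattern are all assumptions made up for this sketch.

```python
from collections import OrderedDict


def simulate_bytes_read(entries, entries_per_cluster, cluster_bytes, clusters_kept):
    """Toy model: total bytes fetched when entries are read in the given order.

    Assumption: the cache holds at most `clusters_kept` whole clusters and
    evicts the least recently used one. Real cache behaviour is more subtle.
    """
    cache = OrderedDict()  # cluster index -> True, ordered by last use
    bytes_read = 0
    for entry in entries:
        cluster = entry // entries_per_cluster
        if cluster in cache:
            cache.move_to_end(cluster)  # cache hit: nothing is fetched
            continue
        bytes_read += cluster_bytes  # cache miss: fetch the whole cluster
        cache[cluster] = True
        if len(cache) > clusters_kept:
            cache.popitem(last=False)  # evict the least recently used cluster
    return bytes_read


# Hypothetical pattern: two threads straddling the boundary at entry 100,
# so the reads alternate between cluster 0 and cluster 1.
access_order = [98, 100, 99, 101, 97, 102]

one_cluster = simulate_bytes_read(access_order, 100, 1_000_000, clusters_kept=1)
two_clusters = simulate_bytes_read(access_order, 100, 1_000_000, clusters_kept=2)

print(one_cluster)   # 6000000: every read evicts the other cluster
print(two_clusters)  # 2000000: each cluster is fetched exactly once
```

In this sketch, keeping one extra cluster removes all the redundant reads, which is the kind of difference the bytes-read counter would expose.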

Brian: Serialization of PyObject in a ROOT file?  E.g. embedding the pickled version?

Philippe: Yes, it would be interesting to pursue, likely with a student/GSoC.

Brian: High latency I/O is a dark art.

Philippe: Can you be more specific about the use case?

Brian: Yes, the cases of bad training.  What about large skips and rewinds?  In CMS we found a side case of what happens if you start reading toward the end and then go back to the beginning.

Brian: We see an immediate deadlock when we turn on asynchronous prefetching.

Philippe: This is ‘new’ to me.  Can you reproduce it and send the deadlock stack trace?

Chris: We should review our code to see what is now redundant.

Brian: There is a PR 240.  

Brian: So I will make a presentation on where the ‘defaults’ can be changed to make hand tuning less necessary.

Philippe: Oksana, what is going on with Fons’ odd-behaving file?

Oksana: Nothing suspicious in the file.  If you look at the size of the baskets, they are pretty constant.  If you recompress the file with the same LZMA compression level, the basket size range is much larger.  They have very, very small baskets.  Maybe they need better settings.

Oksana: I will prepare a presentation summarizing my findings but in short this is them picking the ‘wrong’ compression level and not paying attention to I/O read/write time.

Zhe: Working on stack tracing on macOS; this is challenging.  I will look at lldb’s implementation.

 

    • 16:00 - 16:20  Round Table (20m)