513/1-024 (CERN)



Frank Wuerthwein (Univ. of California San Diego (US)), Ilija Vukotic (University of Chicago (US)), Markus Schulz (CERN), Stephane Jezequel (LAPP-Annecy CNRS/USMB (FR)), Xavier Espinal (CERN)

Present:  Xavier Espinal, Frank Wuerthwein, Ilija Vukotic, Stephane Jezequel, Markus Schulz, Riccardo Di Maria, Andrea Sciabà, Andrea Rizzi, Daniele Spiga, Diego Ciangottini, Gonzalo Merino, Laurent Duflot, Michael Helmut Holzbock, Nikola Hardi, Nikolai Marcel Hartmann, Teng

CMS's Nano-AOD - Andrea Rizzi (INFN Sezione di Pisa, Universita' e Scuola Normale Superiore, Pisa)

  • analysis data access pattern
    - event processing at analysis level differs from processing at reconstruction level
    - analysis “variations” needed 
    - complex event selection (skimming)
  • nanoAOD format
    - ~1-2 kB/event
    - systematic variations not stored
    - fully split (ntuple-like) storage of collection attributes
    - columnar compression (LZMA) in “baskets”
    - data read by typical analysis ~10% of the per-event size (up to ~30-40%)
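A minimal sketch of why a columnar format keeps reads at ~10% of the per-event size (branch names and sizes here are invented for illustration, not actual nanoAOD content): each attribute is stored and LZMA-compressed as its own column, so an analysis touching a few columns decompresses only those bytes.

```python
import lzma
import numpy as np

# Hypothetical columnar file: 20 independently compressed "baskets".
rng = np.random.default_rng(0)
n_events = 20_000
columns = {f"branch_{i:02d}": rng.normal(size=n_events).astype(np.float32)
           for i in range(20)}
baskets = {name: lzma.compress(col.tobytes()) for name, col in columns.items()}
total_bytes = sum(len(b) for b in baskets.values())

def read_columns(wanted):
    """Decompress only the requested columns; report bytes actually read."""
    bytes_read = sum(len(baskets[n]) for n in wanted)
    arrays = {n: np.frombuffer(lzma.decompress(baskets[n]), dtype=np.float32)
              for n in wanted}
    return arrays, bytes_read

# An analysis reading 2 of 20 columns touches ~10% of the stored bytes.
arrays, bytes_read = read_columns(["branch_00", "branch_01"])
print(f"read {bytes_read / total_bytes:.0%} of the stored bytes")
```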
  • analysis-skimming steps
    - analysis skimming can typically reduce the number of events to handle by a factor of ~100
    - some systematic variations computed only after skimming step
    - with “basket-based” compression, or without compression, skimming is necessary to avoid reading the whole column
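The skimming step above can be sketched as a boolean event selection applied column-wise; the branch names and cut values below are invented stand-ins, tuned so that roughly 1 in 100 events survives:

```python
import numpy as np

# Illustrative skim: a selection keeps ~1% of events, so downstream steps
# (e.g. computing systematic variations) run on far fewer rows.
rng = np.random.default_rng(1)
n_events = 100_000
jet_pt = rng.exponential(25.0, n_events)   # toy pT spectrum, GeV
n_leptons = rng.poisson(0.1, n_events)     # toy lepton multiplicity

mask = (jet_pt > 60.0) & (n_leptons >= 1)  # "complex" event selection
skimmed_pt = jet_pt[mask]

print(f"kept {mask.sum()} of {n_events} events "
      f"(reduction ~{n_events / max(mask.sum(), 1):.0f}x)")
```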
  • baskets and compression
    - compression is part of the trick used to store floats with reduced precision
    - better/different option: write a data-format layer; use opaque types in ROOT 6.18
  • HDD (cold vs. warm cache) vs. SSD (cold cache) comparison in the slides
  • spreadsheet to study IO vs CPU boundaries and costs in the slides
    - IO as bottleneck if using optimised code
    - LZMA remains the best trade-off
    - few advantages to having persistent intermediate formats
  • network IO
    - latency hiding to address network latency
    - if data are served from a single (or a few) HDDs, the total seek time cannot be hidden
    - in computing centres, nanoAOD is a small fraction of the data and can be spread across several disks
    - in current experience, network IO at analysis level is better handled with “lazy download”-like solutions => concrete technologies to test are needed here
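The latency-hiding idea above can be sketched as a simple prefetch pipeline (the fetch/process functions and the latency figure are invented stand-ins, not an actual remote-IO client): while chunk N is being processed, chunk N+1 is already being fetched, so network latency overlaps with computation.

```python
import time
from concurrent.futures import ThreadPoolExecutor

LATENCY = 0.02  # simulated per-request network latency, seconds

def fetch(chunk_id: int) -> list:
    time.sleep(LATENCY)  # stand-in for a remote read
    return list(range(chunk_id * 10, chunk_id * 10 + 10))

def process(data: list) -> int:
    return sum(data)

def run_pipelined(n_chunks: int) -> int:
    total = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(fetch, 0)              # prefetch the first chunk
        for i in range(n_chunks):
            data = future.result()                  # wait for current chunk
            if i + 1 < n_chunks:
                future = pool.submit(fetch, i + 1)  # overlap the next fetch
            total += process(data)                  # compute while it arrives
    return total

print(run_pipelined(5))  # → 1225
```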
  • conclusions
    - analysis access patterns typically do not bulk process the events
    - analysis access patterns cherry-pick the information to use
    - per-column read savings are in place
    - per-event read savings are not possible
    - options for uncompressed formats using opaque ROOT types
    - network access with latency hiding should be demonstrated for analysis use cases
  • see slides for more details 
  • from comments:
    - default basket size is 32 kB; actual size to be checked
    - full Run-2 data (100%) in nanoAOD (~50TB)
    - factor of 10 difference between CMS (~50TB) and ATLAS
    - nanoAOD targeting ~60-80% of the CMS analyses
    - miniAOD will still be present
    - this topic started as a PB-scale problem to solve and is now at the TB level: official numbers from computing coordination are needed to wrap up and, if necessary, rescope


HL-LHC review document preparations - Xavier Espinal

  • need input from the DOMA ACCESS community
  • please, contribute to the document
  • community input needed, e.g.:
    - XCache initiatives update after 6 months of experience: US, DE, FR, IT: collected metrics and operational experience
    - baseline computing model estimates for HL-LHC data