DOMA / ACCESS Meeting

Europe/Zurich
513/1-024 (CERN)

Frank Wuerthwein (Univ. of California San Diego (US)), Ilija Vukotic (University of Chicago (US)), Markus Schulz (CERN), Stephane Jezequel (LAPP-Annecy CNRS/USMB (FR)), Xavier Espinal (CERN)

People at CERN: Markus Schulz, Andrea Sciabà, Riccardo Di Maria, Xavier Espinal, David Smith, Oxana Smirnova

People remotely: Andrew Hanushevsky, Bo Jayatilaka, Carlos Perez Dengra, Frank Wuerthwein, Ilija Vukotic, Johannes Elmsheuser, Jose Flix Molina, Laurent Duflot, nhardi, Nikolai Marcel Hartmann, Stephane Jezequel

 

  • David Smith - XCache testing infrastructure:
    - MockData: load generation based on read replay with integrity check
    - server (coordinator) + multi-client (load-generators) architecture
    - XRootD server (data-source) presenting an unlimited number of files (mock-storage)
    - XRootD file transfers; the number of clients (and thus hosts) can be scaled up to reach a useful load level
    - file list of 1.43M files (0.84M unique), 1.50PB, from ATLAS Rucio trace files from a Tier-2 over 1 month
    - scaled interval and predictable rate features, useful to test e.g. XCache
    - verbose log messages useful to understand the machinery at work
    - stress test of XCache@CERN up to 600 MB/s
    - to try yourself: https://gitlab.cern.ch/dhsmith/mockdata , http://dhsmith.web.cern.ch/dhsmith/MockData/v1.0.0-1/
    - to do: fix any bugs and add useful features; write more detailed documentation; set up a bug/feature tracker if there’s sufficient volume of requests
    - potentially this tool is valid for dCache as well
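The read-replay-with-integrity-check idea can be sketched minimally as below. This is not MockData's actual API; the names `replay_trace` and `open_file` are illustrative, and the real tool drives XRootD clients rather than a Python callable:

```python
import hashlib
import time

def replay_trace(entries, open_file, interval_scale=1.0):
    """Replay (timestamp, path, expected_sha256) trace entries against a
    storage endpoint, scaling the original inter-access gaps by
    interval_scale (the "scaled interval" feature). Returns the number
    of integrity-check failures."""
    failures = 0
    prev_ts = None
    for ts, path, expected in entries:
        if prev_ts is not None:
            # Shrink (or stretch) the recorded gap between accesses
            time.sleep(max(0.0, (ts - prev_ts) * interval_scale))
        prev_ts = ts
        data = open_file(path)  # in the real tool: an XRootD read
        if hashlib.sha256(data).hexdigest() != expected:
            failures += 1
    return failures
```

Running many such replay loops in parallel client processes, coordinated by a server, is what raises the load to a useful level.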

 

  • Andrea Sciabà - Cost model study on the cache size/network trade-off:
    - site cache simulated by using file access records for CMS sites
    - showing results for CMS T2 SoCal sites
    - MiniAOD and MiniAODSIM only
    - from cost perspective: too large cache → expensive storage; too small cache → too much WAN traffic (generated for files not cached)
    - tried different cache management strategies
    - storage cost = max(cache occupancy) × cost / unit of disk storage
    - network cost = avg(external traffic / time) × cost / unit of bandwidth
    - cost function: total cost = network cost + storage cost
    - suggestion to add an extra cost term (CPU inefficiency) for a more realistic estimate
    - optimal point depends critically on the access patterns, the scale of the dominant workloads, and the cost scenarios at the site
    - interesting to compare with the actual costs for the actual production cache at SoCal
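The cost model above can be sketched as a small simulation. This assumes an LRU eviction policy (the talk compared several strategies) and illustrative unit costs; `simulate_cost` and its parameters are hypothetical names, not from the study:

```python
from collections import OrderedDict

def simulate_cost(accesses, cache_size, storage_cost_per_byte,
                  network_cost_per_bps, duration_s):
    """Simulate an LRU cache over (filename, size_bytes) accesses and
    apply the cost model from the talk:
      storage cost = max(cache occupancy) * cost per unit of disk
      network cost = avg(external traffic / time) * cost per unit of bandwidth
      total cost   = storage cost + network cost
    """
    cache = OrderedDict()   # filename -> size, in LRU order
    occupancy = 0
    max_occupancy = 0
    wan_bytes = 0           # WAN traffic generated by cache misses
    for name, size in accesses:
        if name in cache:
            cache.move_to_end(name)  # hit: refresh LRU position
        else:
            wan_bytes += size        # miss: file pulled over the WAN
            cache[name] = size
            occupancy += size
            while occupancy > cache_size:
                _, evicted = cache.popitem(last=False)  # evict LRU file
                occupancy -= evicted
        max_occupancy = max(max_occupancy, occupancy)
    storage_cost = max_occupancy * storage_cost_per_byte
    network_cost = (wan_bytes / duration_s) * network_cost_per_bps
    return storage_cost + network_cost
```

Sweeping `cache_size` over a real access trace and plotting the total cost is one way to locate the optimal point; as noted above, where that optimum lands depends on the workload and on the site's unit costs.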