People at CERN: Markus Schulz, Andrea Sciaba, Riccardo Di Maria, Xavier Espinal, David Smith, Oxana Smirnova
People remotely: Andrew Hanushevsky, Bo Jayatilaka, Carlos Perez Dengra, Frank Wuerthwein, Ilija Vukotic, Johannes Elmsheuser, Jose Flix Molina, Laurent Duflot, nhardi, Nikolai Marcel Hartmann, Stephane Jezequel
- David Smith - XCache testing infrastructure:
- MockData: load generation based on read replay with integrity check
- server (coordinator) + multi-client (load-generators) architecture
- XRootD server (data-source) to present an unlimited number of files (mock-storage)
- XRootD file transfers; the number of clients (and thus hosts) can be raised to reach a useful load level
- file list of 1.43M files (0.84M unique), 1.50 PB, from ATLAS Rucio trace files from a Tier-2 over 1 month
- scaled interval and predictable rate features, useful to test e.g. XCache
- verbose log messages useful to understand the machinery at work
- stress test of XCache@CERN up to 600 MB/s
- to try yourself: https://gitlab.cern.ch/dhsmith/mockdata , http://dhsmith.web.cern.ch/dhsmith/MockData/v1.0.0-1/
- to do: fix any bugs and add useful features; more detailed documentation; bug or feature tracker if there’s sufficient volume of requests
- the tool could potentially be used with dCache as well
- Andrea Sciabà - Cost model study on cache size/network trade-off:
- site cache simulated by using file access records for CMS sites
- showing results for CMS T2 SoCal sites
- MiniAOD and MiniAODSIM only
- from cost perspective: too large cache → expensive storage; too small cache → too much WAN traffic (generated for files not cached)
- tried different cache management strategies
- storage cost = max(cache occupancy) × (cost per unit of disk storage)
- network cost = avg(external traffic / time) × (cost per unit of bandwidth)
- cost function: total cost = network cost + storage cost
- suggestion to add an additional cost for a more realistic estimation (CPU inefficiency)
- the optimal point depends critically on the access patterns, the scale of the dominant workloads and the cost scenarios at the site
- interesting to compare with the actual costs for the actual production cache at SoCal
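
The cost function from the cost model study (total cost = storage cost + network cost) can be illustrated with a minimal sketch. This is not the code used in the study: the LRU eviction policy, the `(time, file_id, size)` record layout, the function names and the units (bytes, seconds) are all assumptions made for illustration. The minutes only say that different cache management strategies were tried.

```python
from collections import OrderedDict


def simulate_lru(accesses, capacity):
    """Replay (time, file_id, size) records through an LRU cache.

    LRU is an assumed policy, chosen only for illustration.
    Returns (max_occupancy, wan_bytes): the peak cache occupancy and the
    bytes fetched from outside the site on cache misses.
    """
    cache = OrderedDict()  # file_id -> size, least recently used first
    occupancy = max_occupancy = wan_bytes = 0
    for _time, file_id, size in accesses:
        if file_id in cache:
            cache.move_to_end(file_id)  # hit: refresh recency, no WAN traffic
            continue
        wan_bytes += size  # miss: the file is read over the WAN
        if size > capacity:
            continue  # larger than the whole cache: served but not stored
        while occupancy + size > capacity:
            _, evicted_size = cache.popitem(last=False)  # evict the LRU file
            occupancy -= evicted_size
        cache[file_id] = size
        occupancy += size
        max_occupancy = max(max_occupancy, occupancy)
    return max_occupancy, wan_bytes


def total_cost(accesses, capacity, disk_cost_per_byte, bandwidth_cost_per_byte_per_s):
    """total cost = storage cost + network cost, as defined in the notes.

    `accesses` must be sorted by time and span a non-zero interval.
    """
    duration = accesses[-1][0] - accesses[0][0]
    if duration <= 0:
        raise ValueError("access records must span a positive time interval")
    max_occupancy, wan_bytes = simulate_lru(accesses, capacity)
    storage_cost = max_occupancy * disk_cost_per_byte
    network_cost = (wan_bytes / duration) * bandwidth_cost_per_byte_per_s
    return storage_cost + network_cost
```

Evaluating `total_cost` over a range of `capacity` values for the same access records gives the trade-off curve: larger caches raise the storage term and lower the network term, and the minimum depends on the two unit costs and on the access pattern.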