DOMA / ACCESS Meeting

Name: DOMA / ACCESS Meeting
Start: 2019-12-10T17:30:00+01:00
End: 2019-12-10T18:50:00+01:00
Location: CERN

Tuesday 10 Dec 2019, 17:30 → 18:50 Europe/Zurich

513/1-024 (CERN)

Frank Wuerthwein (Univ. of California San Diego (US)), Ilija Vukotic (University of Chicago (US)), Markus Schulz (CERN), Stephane Jezequel (LAPP-Annecy CNRS/USMB (FR)), Xavier Espinal (CERN)

People at CERN: Markus Schulz, Andrea Sciaba, Riccardo Di Maria, Xavier Espinal, David Smith, Oxana Smirnova

People remotely: Andrew Hanushevsky, Bo Jayatilaka, Carlos Perez Dengra, Frank Wuerthwein, Ilija Vukotic, Johannes Elmsheuser, Jose Flix Molina, Laurent Duflot, nhardi, Nikolai Marcel Hartmann, Stephane Jezequel

David Smith - XCache testing infrastructure:
- MockData: load generation based on read replay with integrity check
- server (coordinator) + multi-client (load-generators) architecture
- XRootD server (data-source) to present an unlimited number of files (mock-storage)
- XRootD file transfers; the number of clients (and thus hosts) can be raised to reach a useful load level
- file list of 1.43M files (0.84M unique), 1.50 PB, from ATLAS Rucio trace files from a Tier-2 over 1 month
- scaled interval and predictable rate features, useful to test e.g. XCache
- verbose log messages useful to understand the machinery at work
- stress test of XCache@CERN up to 600 MB/s
- to try yourself: https://gitlab.cern.ch/dhsmith/mockdata , http://dhsmith.web.cern.ch/dhsmith/MockData/v1.0.0-1/
- to do: fix any bugs and add useful features; more detailed documentation; bug or feature tracker if there’s sufficient volume of requests
- the tool could potentially be used with dCache as well
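
The replay and integrity-check ideas above can be illustrated with a minimal sketch. This is not MockData's implementation: the function names (`scaled_schedule`, `mock_content`, `verify`), the trace layout, and the use of SHA-256 to derive file content from the file name are all assumptions made for illustration only.

```python
import hashlib


def scaled_schedule(trace, scale):
    """Compress the intervals of a trace by `scale` (hypothetical helper).

    trace: list of (timestamp_s, file_name) records.
    Returns a list of (delay_from_start_s, file_name), ordered by time.
    """
    ordered = sorted(trace)
    t0 = ordered[0][0]
    return [((t - t0) / scale, name) for t, name in ordered]


def mock_content(name, size):
    """Deterministic pseudo-content derived from the file name (assumption).

    A mock data source can serve any requested file this way without
    storing it, and a client can recompute the bytes it expects.
    """
    out = bytearray()
    counter = 0
    while len(out) < size:
        out += hashlib.sha256(f"{name}:{counter}".encode()).digest()
        counter += 1
    return bytes(out[:size])


def verify(name, data):
    """Integrity check: compare received bytes with the expected content."""
    return data == mock_content(name, len(data))
```

With `scale = 60`, one hour of trace is replayed in one minute; a load generator would sleep until each delay, read the file through the cache, and call `verify` on what it received.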

Andrea Sciabà - Cost model study on cache size/network trade-off:
- site cache simulated by using file access records for CMS sites
- showing results for CMS T2 SoCal sites
- MiniAOD and MiniAODSIM only
- from cost perspective: too large cache → expensive storage; too small cache → too much WAN traffic (generated for files not cached)
- tried different cache management strategies
- storage cost = max(cache occupancy) × cost per unit of disk storage
- network cost = avg(external traffic / time) × cost per unit of bandwidth
- cost function: total cost = network cost + storage cost
- suggestion to add an additional cost for a more realistic estimation (CPU inefficiency)
- the optimal point depends critically on the access patterns, the scale of the dominant workloads, and the cost scenarios at the site
- interesting to compare with the actual costs for the actual production cache at SoCal
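
The cost function above can be sketched as a small simulation. This is an illustrative sketch only, not the code used in the study: LRU is assumed as one possible cache management strategy, and the record layout, function names, and unit costs are hypothetical.

```python
from collections import OrderedDict


def simulate_lru(accesses, capacity_bytes):
    """Replay (timestamp_s, file_name, size_bytes) records through an LRU cache.

    Returns (max_occupancy_bytes, wan_bytes); wan_bytes counts the bytes
    fetched over the WAN for files that were not cached.
    """
    cache = OrderedDict()  # file_name -> size_bytes, least recently used first
    occupancy = 0
    max_occupancy = 0
    wan_bytes = 0
    for _, name, size in accesses:
        if name in cache:
            cache.move_to_end(name)  # hit: mark as most recently used
            continue
        wan_bytes += size  # miss: the file is read from outside the site
        if size > capacity_bytes:
            continue  # larger than the whole cache, so not stored
        while occupancy + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)
            occupancy -= evicted_size
        cache[name] = size
        occupancy += size
        max_occupancy = max(max_occupancy, occupancy)
    return max_occupancy, wan_bytes


def total_cost(accesses, capacity_bytes, disk_cost_per_byte, bw_cost_per_byte_per_s):
    """total cost = storage cost + network cost, as defined in the notes."""
    duration = accesses[-1][0] - accesses[0][0]
    if duration <= 0:
        raise ValueError("access records must span a positive time interval")
    max_occupancy, wan_bytes = simulate_lru(accesses, capacity_bytes)
    storage_cost = max_occupancy * disk_cost_per_byte
    network_cost = (wan_bytes / duration) * bw_cost_per_byte_per_s
    return storage_cost + network_cost
```

Evaluating `total_cost` over a range of `capacity_bytes` values for the same access records shows the trade-off: storage cost grows with cache size while network cost shrinks, and the minimum of the sum depends on the unit costs chosen.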

- 17:30 → 17:35
  
  Introduction 5m
  
  Speakers: Frank Wuerthwein (Univ. of California San Diego (US)), Ilija Vukotic (University of Chicago (US)), Stephane Jezequel (LAPP-Annecy CNRS/USMB (FR)), Xavier Espinal (CERN)
  
  doma-access-2020.pdf
- 17:35 → 17:55
  
  XCache testing infrastructure 20m
  
  Speaker: David Smith (CERN)
  
  MockData_DOMA_Access_10Dec2019.pdf
  
  MockData_DOMA_Access_10Dec2019.pptx
- 17:55 → 18:15
  
  Cost model study on cache size / network trade off 20m
  
  Speaker: Dr Andrea Sciabà (CERN)
  
  Cost model study on cache size.pdf
  
  Cost model study on cache size.pptx
- 18:15 → 18:20
  
  AOB 5m