A. Sciaba, L. Duflot, J. Walder, F. Wuerthwein, O. Smirnova, J. Elmsheuser, X. Espinal, O. Gutsche, L. Sexton-Kennedy, P. Millar, R. Di Maria, Tigran, D. Weitzel, H. Severini, D. Lange, D. Smith
Presentation from Frank Wuerthwein:
- Extrapolation of data volumes to HL-LHC:
- 0.5 EB of RAW written per year per experiment
- Processing 0.5 EB over 100 days means 5 PB/day per experiment, i.e. 10 PB/day for ATLAS+CMS -> ~1 Tb/s sustained reading speed, with the US T1s holding a 30-40% share : THIS IS A BIG CHALLENGE
- Example of optimisation : Minimise disk buffer in Tape@T1 (carousel model) and buffer at remote processing site (like HPC)
-> Co-schedule the chain
o Opportunity coming to run tests benefitting from the FABRIC project in the US (a few 1 Tb/s links coming in the next 4 years)
--> Proposal to organise US ATLAS+CMS+WLCG to run a challenge, over a single day, with the goal of processing a total of 10 PB of input data
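A quick back-of-the-envelope check of these numbers (a sketch only; it assumes the RAW volume is 0.5 EB per experiment per year, which is what the 10 PB/day and 10-million-job figures imply):

```python
# Back-of-the-envelope check of the HL-LHC processing numbers.
# ASSUMPTIONS: 0.5 EB RAW per experiment per year, a 100-day
# processing campaign, 1 GB average RAW file size, one job per file.

RAW_PER_EXPERIMENT_EB = 0.5   # RAW written per year per experiment
CAMPAIGN_DAYS = 100           # length of the reprocessing campaign
EXPERIMENTS = 2               # ATLAS + CMS

raw_pb = RAW_PER_EXPERIMENT_EB * 1000              # 1 EB = 1000 PB
pb_per_day = raw_pb / CAMPAIGN_DAYS * EXPERIMENTS  # combined daily input
print(f"combined input rate: {pb_per_day:.0f} PB/day")  # -> 10 PB/day

# Sustained read rate in Tb/s: 1 PB = 8000 Tb, 1 day = 86400 s
tb_per_s = pb_per_day * 8000 / 86400
print(f"sustained read rate: {tb_per_s:.2f} Tb/s")      # -> 0.93 Tb/s

# One job per 1 GB RAW file (1 PB = 1e6 GB)
jobs_per_day = pb_per_day * 1e6
print(f"jobs per day: {jobs_per_day:.0f}")              # -> 10 million
```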
+ L. Duflot : Very challenging numbers
o Current max performance of the ATLAS carousel is 50 TB/day -> identify the limiting points
o 1 job per 1 GB Raw file -> 10 million jobs per day
+ B. Jayatilaka : Confirms that this would be a factor 25 compared to today
Frank wants to focus on consolidating all the numbers.
+ O. Smirnova : Extrapolate how many tape drives are needed.
+ L. Duflot : Propose to integrate this exercise with Data Carousel team
+ Given current limitations on TAPE recall -> propose to factorise the different steps for the moment : for example, start with the 10 PB of input already on DISK.
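The tape-drive extrapolation raised above can be sketched as follows; the per-drive rate is a hypothetical assumption (~300 MB/s, roughly modern LTO-class hardware), not a number from the meeting:

```python
# Rough count of tape drives needed to sustain the challenge rate.
# ASSUMPTION: ~300 MB/s sustained per drive (hypothetical; the
# meeting did not quote a drive rate), and no staging inefficiency.

TARGET_PB_PER_DAY = 10
DRIVE_MB_PER_S = 300  # hypothetical per-drive sustained rate

target_mb_per_s = TARGET_PB_PER_DAY * 1e9 / 86400  # 1 PB = 1e9 MB
drives = target_mb_per_s / DRIVE_MB_PER_S
print(f"~{drives:.0f} drives")  # order of a few hundred drives
```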
Next steps (before summer vacation) :
+ Each experiment and WLCG should internally discuss if they want to be part of it.
+ Collect scalability issues from the different components (FTS, Rucio, ...)
+ Check if both experiments can currently process 10 PB per day from DISK with all Grid capacity (IO limitation). If not, run derivations instead, which run faster.
+ Next DOMA Access meeting : Collect the numbers in a single document and invite each contributing team to present their vision of, and current issues with, such a challenge
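For the DISK IO check above, the per-site load can be estimated with a simple sketch; the site count is a hypothetical assumption, not a number from the meeting:

```python
# Per-site IO load if the 10 PB/day is spread across the Grid.
# ASSUMPTION: ~100 contributing sites (hypothetical; the meeting
# does not give a site count) sharing the load evenly.

SITES = 100
TOTAL_PB_PER_DAY = 10

per_site_gb_per_s = TOTAL_PB_PER_DAY * 1e6 / 86400 / SITES  # 1 PB = 1e6 GB
print(f"~{per_site_gb_per_s:.2f} GB/s sustained read per site")
```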