513/1-024 (CERN)



Frank Wuerthwein (Univ. of California San Diego (US)), Ilija Vukotic (University of Chicago (US)), Markus Schulz (CERN), Stephane Jezequel (LAPP-Annecy CNRS/USMB (FR)), Xavier Espinal (CERN)

A. Sciaba, L. Duflot, J. Walder, F. Wuerthwein, O. Smirnova, J. Elmsheuser, X. Espinal, O. Gutsche, L. Sexton-Kennedy, P. Millar, R. Di Maria, Tigran, D. Weitzel, H. Severini, D. Lange, D. Smith

Presentation from Frank Wuerthwein:

   - Extrapolation of data numbers to HL-LHC :

   - 0.5 EB of RAW written per year per experiment

   - Processing this 0.5 EB per experiment (1 EB for ATLAS+CMS) over 100 days would require processing 10 PB/day -> 1 Tb/s reading speed in US T1 (30-40% share) : THIS IS A BIG CHALLENGE

   - Example of optimisation : Minimise the disk buffer in front of Tape@T1 (carousel model) and the buffer at the remote processing site (such as an HPC)

-> Co-schedule the chain

o Upcoming opportunity to run tests benefiting from the FABRIC project in the US (a few 1 Tb/s lines coming in the next 4 years)

--> Proposal to organise US ATLAS+CMS+WLCG to run a challenge, over a single day, with the goal of processing a total of 10 PB of input data
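
As a rough cross-check of the rates quoted in the presentation, the following Python sketch converts 10 PB/day into a sustained network rate. It assumes decimal units (1 PB = 10^15 bytes) and a perfectly flat rate over 24 hours; the function name and the way the 30-40% share is applied are illustrative assumptions, not taken from the slides.

```python
# Back-of-envelope conversion of the daily volume into a sustained rate.
# Assumptions (not from the minutes): decimal units, flat rate over 24 h.
PB = 10**15               # bytes per petabyte (decimal convention)
SECONDS_PER_DAY = 86_400


def sustained_tbps(bytes_per_day: float) -> float:
    """Average rate in Tb/s needed to move bytes_per_day within one day."""
    return bytes_per_day * 8 / SECONDS_PER_DAY / 1e12


daily_bytes = 10 * PB                        # 10 PB/day for ATLAS+CMS
aggregate = sustained_tbps(daily_bytes)      # a little under 1 Tb/s
total_pb_100_days = 100 * daily_bytes // PB  # 1000 PB, i.e. 1 EB

# Illustrative only: the same rate scaled to a 30% and a 40% share.
share_low = 0.30 * aggregate
share_high = 0.40 * aggregate

print(f"aggregate: {aggregate:.2f} Tb/s")
print(f"30-40% share: {share_low:.2f}-{share_high:.2f} Tb/s")
print(f"volume over 100 days: {total_pb_100_days} PB")
```

Any burstiness, retries, or protocol overhead would push the required peak rate above this flat average.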

+ L. Duflot : Very challenging numbers

o Current max performance of the ATLAS carousel is 50 TB/day -> Identify the limiting points

o 1 job per 1 GB Raw file -> 10 million jobs per day

+ B. Jayatilaka : Confirms that this would be a factor of 25 compared to today

Frank wants to focus on consolidating all the numbers.

+ O. Smirnova : Extrapolate to how many tape drives are needed.

+ L. Duflot : Proposes to integrate this exercise with the Data Carousel team

+ TAPE recall is the current limitation -> Proposal to factorise the different steps for the moment : for example, start with the 10 PB input already on DISK.
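
The figures raised in the discussion can be put side by side in a short Python sketch. The 10 PB/day target, the 1 GB RAW file size, and the 50 TB/day carousel rate come from the minutes; the per-drive tape throughput (300 MB/s) is a placeholder assumption chosen only to show the shape of the tape-drive extrapolation, and all names are hypothetical.

```python
import math

# Decimal units are assumed throughout.
GB = 10**9
TB = 10**12
PB = 10**15
SECONDS_PER_DAY = 86_400

target_bytes_per_day = 10 * PB       # challenge goal from the minutes
raw_file_bytes = 1 * GB              # one job per 1 GB RAW file
carousel_bytes_per_day = 50 * TB     # quoted current ATLAS carousel maximum

# One job per file: number of jobs per day and average submission rate.
jobs_per_day = target_bytes_per_day // raw_file_bytes        # 10_000_000
jobs_per_second = jobs_per_day / SECONDS_PER_DAY

# Scale-up needed relative to the quoted carousel rate.
carousel_scale_up = target_bytes_per_day // carousel_bytes_per_day  # 200

# Placeholder, NOT from the minutes: sustained read rate of one tape drive.
ASSUMED_DRIVE_BYTES_PER_S = 300 * 10**6


def drives_needed(bytes_per_day: int, drive_bytes_per_s: int) -> int:
    """Drives required if every drive streams continuously for 24 h."""
    return math.ceil(bytes_per_day / (drive_bytes_per_s * SECONDS_PER_DAY))


drives = drives_needed(target_bytes_per_day, ASSUMED_DRIVE_BYTES_PER_S)

print(f"jobs/day: {jobs_per_day}, jobs/s: {jobs_per_second:.0f}")
print(f"scale-up vs carousel: x{carousel_scale_up}")
print(f"tape drives (idealised lower bound): {drives}")
```

The drive count ignores mount, seek, and scheduling overheads, so it should be read as a lower bound under the assumed per-drive rate.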

Next steps (before summer vacation) :

+ Each experiment and WLCG should internally discuss if they want to be part of it. 

+ Collect scalability issues from the different components (FTS, Rucio, ..)

+ Check whether both experiments can currently process 10 PB per day from DISK using all Grid capacity (IO limitation). If not, run derivation, which runs faster.

+ Next DOMA Access meeting : Collect the numbers in a single document and invite each contributing team to present their vision and the issues they see today with such a challenge




    • 5:30 PM - 5:35 PM
      Introduction 5m
      Speakers: Frank Wuerthwein (Univ. of California San Diego (US)), Ilija Vukotic (University of Chicago (US)), Stephane Jezequel (LAPP-Annecy CNRS/USMB (FR))
    • 5:35 PM - 6:05 PM
      Idea for a production focused Data Challenge 30m
      Speakers: Frank Wuerthwein (Univ. of California San Diego (US))
    • 6:25 PM - 6:35 PM
      AOB 10m