EP R&D Software Working Group Meeting

Europe/Zurich
Vidyo


Software R&D Working Meeting Minutes

Introduction

  • Hardware

    • Would like to buy a box also for reconstruction
    • Would the spec for the simulation box meet these needs?
      • Hadn’t foreseen a GPU, but one would be useful strategically and for HGCAL
      • At the moment the HGCAL group has its needs covered internally via CMS resources
    • ACTION: Include Andi, Moritz, Marco and Felice in the discussion with IT
    • Do we want Intel CPUs in the suite of R&D machines?
      • Probably these will come in the Analysis machine
  • Next meeting

    • Agreed to cover HGCAL reconstruction in June
    • Will decide on the date soon

Analysis Systems

  • DAOS is an Intel storage system designed to replace the cluster filesystems in data centres
    • SSD based, so it targets the high-performance part of the storage hierarchy
    • Can emulate a filesystem, but for the highest performance it needs to be addressed as an object store (a toy contrast is sketched after this list)
  • Object granularity
    • Too early to say what will be best (pages or clusters), or whether one size fits all is possible
  • How to interface to the data management layer?
    • Will add metadata to what is stored and this will have a namespace associated with it
    • Too early to say exactly what the interface to the data management layer would be (and out of scope for us to tackle it right now)
      • Will need to expose things at the correct level of granularity (unlikely the DM system wants to know about 10kB pages)
    • Do plan to get away from the file notion as central, from the analysis side
    • RNTuples are stored inside the current TFile objects, but this is a lightweight bootstrap (a short PyROOT sketch after this list shows the RNTuple appearing as an ordinary key)
  • For Xrootd the XCache layer would be good to look at
    • Contact Andy Hanushevsky
  • Snapshots of intermediate analysis
    • Suggested to enable this behind the scenes (user doesn’t need to know)
    • Where to store these results?
      • Local SSD: very fast, but then not accessible to the rest of the analysis nodes (workload scheduling problem)
      • ClusterFS: accessible to the whole cluster, but may be performance limited
    • Spark has done interesting work on this (resilient distributed datasets)
    • Parsl does this caching by hashing the Python code and the calling parameters, storing the intermediate results in files (a minimal sketch follows after this list)
      • Separation of caching input data from the processed outputs would be advantageous
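
As a purely illustrative contrast between the two DAOS access modes noted above (POSIX-style filesystem emulation versus direct object addressing), the toy Python sketch below uses a hypothetical in-memory ObjectStore class; none of the names correspond to the real DAOS API.

```python
# Toy illustration only: contrasts key-based object addressing with
# path-based (POSIX-like) access. The real DAOS API is different; the
# ObjectStore class here is entirely hypothetical.
import os
import tempfile


class ObjectStore:
    """Minimal in-memory stand-in for an object store (hypothetical)."""

    def __init__(self):
        self._objects = {}

    def put(self, container, oid, payload):
        # Data is addressed directly by (container, object id), with no
        # directory traversal or file offsets involved.
        self._objects[(container, oid)] = payload

    def get(self, container, oid):
        return self._objects[(container, oid)]


store = ObjectStore()

# Object-store style access: address the payload by key.
store.put("analysis2020", "cluster-0042", b"...serialised cluster...")
payload = store.get("analysis2020", "cluster-0042")

# Filesystem-emulation style: the same payload reached through a path,
# which is convenient for existing tools but adds a translation layer.
root = tempfile.mkdtemp()
path = os.path.join(root, "analysis2020", "cluster-0042")
os.makedirs(os.path.dirname(path), exist_ok=True)
with open(path, "wb") as f:
    f.write(payload)
```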
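
The point that an RNTuple currently lives inside an ordinary TFile can be seen by listing the keys of such a file with PyROOT; the file name data.root and the ntuple name Events are placeholders, and a ROOT build with RNTuple support is assumed.

```python
# Sketch (assumes PyROOT is available and that data.root already contains
# an RNTuple, e.g. one written under the name "Events"; both names are
# placeholders).
import ROOT

f = ROOT.TFile.Open("data.root")
if not f or f.IsZombie():
    raise RuntimeError("could not open data.root")

# The RNTuple shows up as a regular key of the TFile: that key is the
# lightweight anchor which bootstraps access to the RNTuple data stored
# in the same file.
for key in f.GetListOfKeys():
    print(key.GetName(), key.GetClassName())

f.Close()
```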
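
For the Parsl memoisation mentioned in the last item, a minimal sketch (assuming Parsl is installed and using its bundled local-threads configuration) could look like this; Parsl itself handles the hashing of the app and its call arguments.

```python
# Sketch (assumes Parsl is installed; uses its bundled local-threads config).
# Parsl memoises app calls by hashing the app and its arguments, so a
# repeated call with the same inputs is served from the cache instead of
# being recomputed.
import parsl
from parsl import python_app
from parsl.configs.local_threads import config

parsl.load(config)


@python_app(cache=True)  # enable app-level caching / memoisation
def select_values(values, threshold):
    # Stand-in for an expensive intermediate analysis step (hypothetical).
    return [v for v in values if v > threshold]


data = list(range(100))
first = select_values(data, 42).result()    # computed
second = select_values(data, 42).result()   # returned from the cache
assert first == second

# Persisting cached results to files across separate runs is handled by
# Parsl's checkpointing support (not shown here).
```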