ATLAS HPC Sites: Toward Common Solutions

Europe/Amsterdam
13/3-005 (CERN)

Description

This meeting will work out the details of how to keep moving toward common solutions at large HPC facilities. Topics include Harvester (+PanDA), the Event Service, and containers for software distribution.

    • 14:00 - 14:40
      Harvester Discussion

      During this discussion we should talk about:
      - What features need to be added to or tested in Harvester before it is passed on to OLCF, NERSC, and BNL?
      - Are there pieces still missing in PanDA that would help support HPC-type queues?
      - How do we handle planned outages without moving jobs around unnecessarily? We should support short outages (<= 1 day), where jobs can remain in place, and long outages (> 1 day), where jobs should be diverted.
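
      As a starting point for the outage discussion, the one-day boundary between short and long outages can be sketched as below. This is only an illustration: the function name, the constant, and the example dates are hypothetical and are not part of Harvester or PanDA. An outage of exactly one day counts as short, matching the "<= 1 day" wording above.

```python
from datetime import datetime, timedelta

# Assumption: the one-day boundary is taken from the agenda text.
# The constant and function names are hypothetical, not a Harvester or PanDA API.
SHORT_OUTAGE_LIMIT = timedelta(days=1)


def classify_outage(start: datetime, end: datetime,
                    limit: timedelta = SHORT_OUTAGE_LIMIT) -> str:
    """Return "short" for outages lasting at most `limit`, otherwise "long"."""
    if end < start:
        raise ValueError("outage ends before it starts")
    return "short" if (end - start) <= limit else "long"


# Example: a 12-hour maintenance window is classified as a short outage.
kind = classify_outage(datetime(2017, 1, 1, 8, 0), datetime(2017, 1, 1, 20, 0))
```

      Keeping the limit as a parameter would let each site tune the boundary if one day turns out not to fit its scheduling.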

    • 14:40 - 15:10
      Data Motion

      Discuss:
      - Globus Online tools for Harvester and Rucio
      - Dual End-points

    • 15:10 - 16:00
      Software Distribution

      In this section we should discuss:
      - How will the production of containers work?
      - What releases will be included?
      - How do we distribute them?
      - How do we deal with the time lag between a new release appearing on CVMFS and that release being available at the site in a container?