Analysis Queues Performance

America/Chicago
Vidyo

Ilija Vukotic (Universite de Paris-Sud 11 (FR))
Description
We need to understand the performance of the ATLAS analysis queues and eventually improve it. We will start with US sites and document our findings; hopefully our experience will help all ATLAS grid sites. There are two things we will try to steer clear of: 1. Micromanaging sites. 2. Micromanaging ATLAS code. We'll use continuously running HC functional test jobs to look at the performance of each site and try to understand any features we observe. While a direct comparison between sites is not possible, we'll try to bring each of them to its optimal performance.
people: David Lesny, Shawn, Wei, Sarah, Rob, Saul, Patrick McGuigan, Hiro.

summary
AGLT2: strange behaviour. direct access helps, but the 24-core machines are not running any tests.
HU: quite bad performance. will need a one-on-one discussion.
BNL: long pre-stage time. needs to be understood.
MWT2: no jobs submitted since 1st Jun. a lot of empty results since 8th May.

Patrick wants to know if testing is occurring at the SWT2 analysis sites:
ANALY_SWT2_CPB and ANALY_OU_OCHEP_SWT2

we should start with switching direct access/pre-stage tests

we should have plots of currently running and currently queued jobs for all the sites.

proposal: automatically produce a small set of plots for all the sites, fast to get and easy to digest, once a week. people could subscribe to it (rss?).

think about automatic e-mail messaging in case of serious performance problems.
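
A minimal sketch of what such automatic messaging could look like, offered only as a starting point for discussion. Everything specific here is an assumption and was not decided at the meeting: the metric names (`prestage_s`, `cpu_eff`), the threshold values, the function names, and the message format are all hypothetical. The sketch only builds the message; sending it (for example with Python's `smtplib`) is left out.

```python
from email.message import EmailMessage

# Hypothetical thresholds; the meeting did not define any.
MAX_PRESTAGE_SECONDS = 600.0
MIN_CPU_EFFICIENCY = 0.5


def find_problems(results):
    """Return {site: [reason, ...]} for sites that breach a threshold.

    `results` maps a site name to a dict with the hypothetical keys
    'prestage_s' (mean pre-stage time in seconds) and 'cpu_eff' (0..1).
    """
    problems = {}
    for site, metrics in results.items():
        reasons = []
        if metrics["prestage_s"] > MAX_PRESTAGE_SECONDS:
            reasons.append(f"pre-stage time {metrics['prestage_s']:.0f}s")
        if metrics["cpu_eff"] < MIN_CPU_EFFICIENCY:
            reasons.append(f"CPU efficiency {metrics['cpu_eff']:.2f}")
        if reasons:
            problems[site] = reasons
    return problems


def build_alert(problems, sender, recipient):
    """Build an e-mail describing the problems, or None if there are none."""
    if not problems:
        return None
    msg = EmailMessage()
    msg["Subject"] = f"Analysis queue alert: {len(problems)} site(s)"
    msg["From"] = sender
    msg["To"] = recipient
    lines = [
        f"{site}: {', '.join(reasons)}"
        for site, reasons in sorted(problems.items())
    ]
    msg.set_content("\n".join(lines))
    return msg
```

Returning `None` when nothing breaches a threshold keeps the caller from sending empty messages, so mail goes out only for serious problems.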

    • 14:00 14:10
      Introduction 10m
      Speaker: Ilija Vukotic (Universite de Paris-Sud 11 (FR))
      Slides
    • 14:10 14:30
      Test results - Current state 20m
      Speaker: Ilija Vukotic (Universite de Paris-Sud 11 (FR))
      Slides
    • 14:30 15:00
      AOB