Attendance
local: Eddy, Alessandra, Marian, Maarten, Pablo, Nicolo, Lionel, Julia, Valentina, Luca, Andrea, Simone, Alessandro, Alberto
remote: David
Minutes taken by Nicolo

=== JIRA actions for February
Almost all tasks due for February are closed
ticket 28 on drupal - Julia will close it
ticket 52 on downtime information - still open, because ACE/MyWLCG/SUM all use different algorithms - discussion needed to agree which one to use
ticket 42 - simplification - Luca will update with a draft for the proposal next week
ticket 16 - HC schema - moved to March
Pablo reminds everyone to check the many tasks for March

==== Valentina presents HammerCloud
Slide 8 on Job Templates:
Pablo: how to send tests to many sites?
Valentina: from a single template, define tests which submit tasks ("Ganga jobs", each contains many jobs) to many sites, replacing variables in the template.
Example template on slide 8: 'inputdata' is CRAB-specific; other variables are for Ganga to choose the plugin
Demonstration of "Add template" GUI:
- parameters for test frequency/duration
- parameters for the location of the template/tarball files on the HC host. The files need to be copied to the HC host by an operator/cron job.
Marian: how do we set command line args in a test?
- extraargs parameter
Demonstration of existing template in "Template" GUI:
- configuration of sites, job pressure (max jobs in queue and running)
Pablo: when are CE/SE white lists set?
Andrea: for CRAB, taken from 'Site' table in HammerCloud. Requires manual update, tests stop if SE name changes. CEs taken from BDII.

Slide 11:
Valentina: "Test create" creates the test entry in the DB; "Test generate" writes the Ganga job (only once per test).
Valentina: HC polls experiment WMSes for job monitoring, frequency configurable (usually 30 seconds)
Julia: how do you keep a constant load of jobs on a site?
Andrea: HC monitors jobs; when the number of jobs falls below a configurable minimum, HC submits a new task
Alessandro: functional tests have 1 job per task, so you stay constant at the minimum. Stress tests have many jobs per task, so you will go above the minimum.
Julia: how do we avoid too many queries to experiment WMSes if there are too many HC jobs?
Simone, Alessandro, Andrea, Nicolo': bulk queries to experiment WMSes

Slide 15:
Valentina: to calculate some metrics (e.g. CPU/Walltime), HC needs to get information at the end of the job. Implementation depends on the plugin, e.g. download the log and parse it. To add a new metric, the plugin needs to be updated if the info is not already available.

Slide 16:
Alessandro: "Athena nightly build system" tests are to validate nightly releases

Slide 17:
y axis is number of tests/month --> on avg ~1000 jobs/test

Discussion after presentation:
Alessandra: Role of Ganga?
Valentina: common interface to CRAB/PanDA/DIRAC. Also has local sqlite for job tracking
Pablo: could we use Ganga plugins to test different things, in addition to jobs? E.g. run local script to test SRM.
Valentina, Maarten: yes, but Ganga is designed around jobs.
Simone: do you propose to replace Nagios with HC functional tests?
Pablo: scheduling functionality is similar, and HC has more reports
Andrea: but Nagios is much more powerful for configuring scheduling
Valentina: HC copies job status from Ganga DB into HC DB
Julia: so job status is tracked three times: in experiment WMS, Ganga DB and HC DB. It seems that Ganga is not needed for this, only as a common interface.
Andrea: Ganga is there for historical reasons, but removing it needs heavy rewriting
Luca: to evaluate HC for SAM, compare with what we do with Nagios:
- WN tests: HC could do it
- CE job submission tests: impossible with HC, cannot submit to specific CEs, HC not designed to do it
- other tests (e.g. SRM)
Pablo: two topics for next meeting:
1) definition of availability/reliability (currently three systems do it in different ways)
2) continue discussion on Nagios Ops

AOB?
David - I can give summary slides of UK discussion on monitoring
Alberto volunteers to take minutes next time
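
Note on the constant-load mechanism from the slide 11 discussion: the behaviour described by Andrea and Alessandro can be sketched roughly as below. This is only an illustration under assumptions, not HammerCloud code: the function name `poll_once` and its parameters are hypothetical, and it assumes that at most one new task is submitted per monitoring cycle.

```python
def poll_once(active_jobs: int, min_jobs: int, jobs_per_task: int) -> int:
    """Return the number of active jobs on a site after one monitoring cycle.

    Hypothetical sketch: if the site has fewer active jobs than the
    configured minimum, one new task (containing jobs_per_task jobs)
    is submitted; otherwise nothing is submitted.
    """
    if jobs_per_task < 1:
        raise ValueError("jobs_per_task must be at least 1")
    if active_jobs < min_jobs:
        # Below the minimum: submit a single new task.
        active_jobs += jobs_per_task
    return active_jobs


# Functional test: 1 job per task, so the load returns to the minimum.
print(poll_once(active_jobs=9, min_jobs=10, jobs_per_task=1))   # 10

# Stress test: many jobs per task, so the load overshoots the minimum.
print(poll_once(active_jobs=9, min_jobs=10, jobs_per_task=50))  # 59
```

With one job per task the site stays at the minimum; with many jobs per task a single submission pushes the load above it, as noted in the discussion.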