Attendance
local: Eddy, Alessandra, Marian, Maarten, Pablo, Nicolo, Lionel, Julia, Valentina, Luca, Andrea, Simone, Alessandro, Alberto
remote: David
Minutes taken by Nicolo
===
JIRA actions for February
Almost all tasks due in February are closed
ticket 28 on drupal - Julia will close it
ticket 52 on downtime information - still open, because ACE/MyWLCG/SUM all use different algorithms - discussion needed to agree on which one to use
ticket 42 - simplification - Luca will update it with a draft of the proposal next week
ticket 16 - HC schema - moved to March
Pablo reminds everyone to check the many tasks due in March
===
Valentina presents HammerCloud
Slide 8 on Job Templates:
Pablo: how do you send tests to many sites?
Valentina: from a single template, you define tests that submit tasks ("Ganga jobs", each containing many jobs) to many sites, replacing the variables in the template.
Example template on slide 8: 'inputdata' is CRAB-specific; the other variables let Ganga choose the plugin
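The template expansion described above can be sketched as follows. This is only an illustration using Python's standard string.Template: the placeholder names, site names and dataset path are hypothetical, and this is not HammerCloud's or Ganga's actual template format.

```python
from string import Template

# Hypothetical job template. The placeholder names are illustrative only
# and are not HammerCloud's actual template variables.
JOB_TEMPLATE = Template(
    "backend = $backend\n"
    "site = $site\n"
    "inputdata = $inputdata\n"
)


def expand_template(template, sites, common):
    """Render one job description per site from a single template."""
    return {site: template.substitute(common, site=site) for site in sites}


# Assumed example values: two made-up sites sharing the same settings.
jobs = expand_template(
    JOB_TEMPLATE,
    sites=["SITE_A", "SITE_B"],
    common={"backend": "CRAB", "inputdata": "/hypothetical/dataset"},
)
```

Each entry in jobs holds the rendered text for one site, so adding a site means adding one name to the list rather than writing a new template.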
Demonstration of "Add template" GUI:
- parameters for test frequency/duration
- parameters for the location of the template/tarball files on the HC host. The files need to be copied to the HC host by an operator or a cron job.
Marian: how do we set command-line arguments in a test?
- with the 'extraargs' parameter
Demonstration of existing template in "Template" GUI:
- configuration of sites, job pressure (max jobs in queue and running)
Pablo: when are CE/SE white lists set?
Andrea: for CRAB, they are taken from the 'Site' table in HammerCloud. This requires manual updates, and tests stop if an SE name changes. CEs are taken from the BDII.
Slide 11:
Valentina: "Test create" creates the test entry in the DB; "Test generate" writes the Ganga job (only once per test).
Valentina: HC polls the experiment WMSes for job monitoring; the polling interval is configurable (usually 30 seconds)
Julia: how do you keep a constant load of jobs on a site?
Andrea: HC monitors the jobs; when the number of jobs falls below a configurable minimum, HC submits a new task
Alessandro: functional tests have 1 job per task, so you stay constant at the minimum. Stress tests have many jobs per task, so you will go above the minimum.
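The difference between functional and stress tests can be illustrated with a minimal sketch. It assumes that HC keeps submitting whole tasks until the job count is back at or above the minimum; the function name and the numbers are hypothetical and do not come from the HC code.

```python
def top_up(active_jobs, min_jobs, jobs_per_task):
    """Submit whole tasks until the job count reaches the minimum.

    Returns (job count after submission, number of tasks submitted).
    Assumption: tasks are submitted one after another until the
    minimum is reached; this is an illustration, not HC's scheduler.
    """
    if jobs_per_task <= 0:
        raise ValueError("jobs_per_task must be positive")
    submitted = 0
    while active_jobs < min_jobs:
        active_jobs += jobs_per_task
        submitted += 1
    return active_jobs, submitted


# Functional test: 1 job per task, so the site ends exactly at the minimum.
functional = top_up(active_jobs=7, min_jobs=10, jobs_per_task=1)

# Stress test: 50 jobs per task, so a single task overshoots the minimum.
stress = top_up(active_jobs=7, min_jobs=10, jobs_per_task=50)
```

With one job per task the count can only reach the minimum in steps of one, while a multi-job task is submitted whole and therefore usually overshoots.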
Julia: how do we avoid too many queries to experiment WMSes if there are too many HC jobs?
Simone, Alessandro, Andrea, Nicolo: use bulk queries to the experiment WMSes
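A minimal sketch of the bulk-query idea: job IDs are grouped into batches and each batch costs one request, instead of one request per job. The query_bulk callable stands in for whatever bulk interface an experiment WMS offers; it and the batch size are assumptions for illustration.

```python
def chunked(ids, size):
    """Yield consecutive slices of ids with at most size elements each."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def poll_statuses(job_ids, query_bulk, batch_size=100):
    """Collect job statuses using one query per batch of job IDs.

    query_bulk is a hypothetical callable that takes a list of job IDs
    and returns a dict mapping each ID to its status.
    """
    statuses = {}
    for batch in chunked(job_ids, batch_size):
        statuses.update(query_bulk(batch))
    return statuses


# Stand-in for a WMS: counts the requests and reports every job as running.
calls = []


def fake_query_bulk(batch):
    calls.append(len(batch))
    return {job_id: "running" for job_id in batch}


result = poll_statuses(list(range(250)), fake_query_bulk, batch_size=100)
```

Here 250 jobs are covered by three requests (100, 100 and 50 IDs), so the load on the WMS grows with the number of batches rather than the number of jobs.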
Slide 15:
Valentina: to calculate some metrics (e.g. CPU/Walltime), HC needs to collect information at the end of the job.
The implementation depends on the plugin, e.g. downloading the log and parsing it.
To add a new metric, the plugin needs to be updated if the information is not already available.
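As an illustration of the "download log and parse it" approach, the sketch below computes a CPU/Walltime ratio from a job log. The log format (cpu_time=... and wall_time=... lines) is invented for this example; real plugins parse whatever their backend actually writes.

```python
import re

# Assumed log format: one "cpu_time=<seconds>" and one "wall_time=<seconds>"
# line somewhere in the log. This format is hypothetical.
LOG_PATTERN = re.compile(r"^(cpu_time|wall_time)=(\d+(?:\.\d+)?)$", re.MULTILINE)


def cpu_efficiency(log_text):
    """Return CPU time divided by wall time, or None if it cannot be computed."""
    values = {key: float(val) for key, val in LOG_PATTERN.findall(log_text)}
    wall = values.get("wall_time")
    cpu = values.get("cpu_time")
    if cpu is None or not wall:
        # Missing fields or zero wall time: the metric is not available.
        return None
    return cpu / wall


sample_log = "job started\ncpu_time=450.0\nwall_time=600.0\njob finished\n"
efficiency = cpu_efficiency(sample_log)
```

Returning None when a field is missing mirrors the point made in the minutes: a metric can only be computed if the plugin already makes the required information available.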
Slide 16:
Alessandro: "Athena nightly build system" tests are to validate nightly releases
Slide 17: the y axis is the number of tests per month; on average ~1000 jobs per test
Discussion after presentation:
Alessandra: Role of Ganga?
Valentina: it is a common interface to CRAB/PanDA/DIRAC. It also has a local SQLite database for job tracking
Pablo: could we use Ganga plugins to test different things, in addition to jobs? E.g. run local script to test SRM.
Valentina, Maarten: yes, but Ganga is designed around jobs.
Simone: do you propose to replace Nagios with HC functional tests?
Pablo: scheduling functionality is similar, and HC has more reports
Andrea: but Nagios is much more powerful for configuring scheduling
Valentina: HC copies job status from Ganga DB into HC DB
Julia: so job status is tracked three times: in the experiment WMS, the Ganga DB and the HC DB. It seems that Ganga is not needed for this, only as a common interface.
Andrea: Ganga is there for historical reasons, but removing it would require heavy rewriting
Luca: to evaluate HC for SAM, compare with what we do with Nagios:
- WN tests: HC could do it
- CE job submission tests: not possible with HC, since it cannot submit to specific CEs and was not designed to do so
- other tests (e.g. SRM)
Pablo: two topics for next meeting:
1) definition of availability/reliability (currently three systems do it in different ways)
2) continue discussion on Nagios Ops
AOB?
David: I can provide summary slides of the UK discussion on monitoring
Alberto volunteers to take minutes next time