Long discussion on how to define and assign work. These suggestions were made:
Alessandra pointed out that we will also need some infrastructure on which to conduct tests and measurements. Markus mentioned that the UP team at CERN has a performance evaluation testbed that can be used. Alessandra also stressed the need for the tasks to be as specific and concrete as possible.
Markus proposed that one of the first tasks should be to identify the most important workloads (to be done by people from the experiments).
According to Markus, another initial task could be to look at both open and commercial performance analysis tools, which could produce an enormous number of metrics, and see which ones are most relevant for us.
Johannes asked if metrics should be measured only in a controlled environment or also in the production environment. Andrea Sciabà answered that one should do both and compare the results, as any significant discrepancy should be understood. For example, as Johannes said, data access can have a strong effect and can be very different between a lab setup and the WLCG infrastructure. This can be very tricky to model.
It is agreed that a controlled environment is in any case essential to really understand the application's behaviour.
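The comparison between lab and production measurements discussed above could be sketched roughly as follows. This is only an illustration: the metric names, the example values and the 20% threshold are all hypothetical and were not discussed at the meeting.

```python
def relative_discrepancies(lab, prod, threshold=0.2):
    """Return metrics whose production value deviates from the lab value
    by more than `threshold` (relative to the lab value).

    `lab` and `prod` are dicts mapping a metric name to a number.
    Only metrics present in both dicts are compared.
    """
    flagged = {}
    for name in sorted(lab.keys() & prod.keys()):
        reference = lab[name]
        if reference == 0:
            continue  # a relative difference is undefined for a zero reference
        rel = (prod[name] - reference) / abs(reference)
        if abs(rel) > threshold:
            flagged[name] = rel
    return flagged


# Hypothetical numbers, for illustration only.
lab_metrics = {"cpu_efficiency": 0.95, "read_mb_per_s": 120.0, "wall_time_s": 3600.0}
prod_metrics = {"cpu_efficiency": 0.70, "read_mb_per_s": 35.0, "wall_time_s": 3900.0}

flagged = relative_discrepancies(lab_metrics, prod_metrics)
for name, rel in flagged.items():
    print(f"{name}: {rel:+.0%} relative to the lab measurement")
```

In this made-up example the data access rate and the CPU efficiency are flagged, while the wall time stays within the threshold; each flagged metric would then need to be understood.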
Markus presents his and Andrea's ideas on performance, efficiency and cost. He points out that in the last few years a lot of progress has been made in studying the behaviour of jobs and metrics as a function of time. This should make it easier to build models of workloads (and full workflows) that take their time structure into account.
It is very important to be able to find out what the performance-limiting factors are (including bottlenecks and resource starvation) and to measure how much of the hardware capabilities we are exploiting (e.g. how far we are from the theoretical limit on the number of instructions per cycle on each core). This is not always trivial: for example, a CPU that looks fully loaded could be stalled by memory access.
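As a minimal sketch of the kind of measurement meant here, the snippet below derives the achieved instructions per cycle (IPC), its ratio to an assumed theoretical peak, and the fraction of stalled cycles from a set of hardware counter readings. The `CounterSample` structure, the counter values and the peak IPC of 4 are assumptions made for illustration; real values depend on the CPU and on the tool used to read the counters.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """Hypothetical hardware-counter readings for one core over one interval."""
    instructions: int    # retired instructions
    cycles: int          # unhalted core cycles
    stalled_cycles: int  # cycles in which the core made no progress (e.g. waiting on memory)


def ipc(sample):
    """Achieved instructions per cycle."""
    return sample.instructions / sample.cycles


def ipc_efficiency(sample, peak_ipc):
    """Fraction of the assumed theoretical peak IPC that was achieved."""
    return ipc(sample) / peak_ipc


def stall_fraction(sample):
    """Fraction of cycles in which the core was stalled."""
    return sample.stalled_cycles / sample.cycles


PEAK_IPC = 4.0  # assumed peak for a hypothetical core, not a measured value

# A core that is busy for the whole interval (so it looks 100% loaded to the
# operating system) but spends most of its cycles stalled.
sample = CounterSample(
    instructions=8_000_000_000,
    cycles=10_000_000_000,
    stalled_cycles=6_000_000_000,
)

print(f"IPC: {ipc(sample):.2f}")
print(f"Fraction of peak IPC: {ipc_efficiency(sample, PEAK_IPC):.0%}")
print(f"Stalled cycles: {stall_fraction(sample):.0%}")
```

With these invented numbers the core reaches only a fifth of the assumed peak while being stalled for 60% of its cycles, which is the situation described above: high apparent load, low useful throughput.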
Jan proposes to work on two types of models: one to describe the hardware in abstract terms and one to relate the workload to the hardware model. These models should start simple, applying a bottom-up approach to a simple application. Similarly, workloads should be split into smaller units.
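To make the two-model idea more concrete, here is a very simple sketch. It is not Jan's proposal but a roofline-style bound substituted for illustration: the hardware is described by two abstract numbers (peak operation rate and memory bandwidth), and each workload unit is related to it through the operations it executes and the bytes it moves. All class names, unit names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HardwareModel:
    """Abstract hardware description (assumed parameters)."""
    peak_ops_per_s: float   # peak operation rate
    mem_bytes_per_s: float  # sustainable memory bandwidth


@dataclass
class WorkloadUnit:
    """One small unit of a workload, described by its resource demands."""
    name: str
    ops: float          # operations executed
    bytes_moved: float  # bytes transferred to/from memory


def predict_time(unit, hw):
    """Lower bound on the run time of `unit` on `hw`, and the limiting resource.

    Assumes compute and memory traffic overlap perfectly, so the slower of
    the two determines the time (roofline-style bound).
    """
    compute_time = unit.ops / hw.peak_ops_per_s
    memory_time = unit.bytes_moved / hw.mem_bytes_per_s
    limiting = "compute" if compute_time >= memory_time else "memory"
    return max(compute_time, memory_time), limiting


# Hypothetical hardware and workload units, for illustration only.
hw = HardwareModel(peak_ops_per_s=1e10, mem_bytes_per_s=2e10)
units = [
    WorkloadUnit("unpack", ops=1e9, bytes_moved=8e10),
    WorkloadUnit("reconstruct", ops=5e11, bytes_moved=4e10),
]

total = 0.0
for unit in units:
    seconds, limiting = predict_time(unit, hw)
    total += seconds
    print(f"{unit.name}: at least {seconds:.1f} s, limited by {limiting}")
print(f"whole workload: at least {total:.1f} s")
```

Splitting the workload into units makes it possible to see that different parts can be limited by different resources; a more realistic model would add further hardware parameters (caches, I/O, network) one at a time.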
David asks which concrete actions could be defined for the experiments. Markus proposes to identify one particular workflow and study it in detail as the main programme for the first year. Andrea points out that, in parallel, it would be very important to classify the most important workflows.
The time and frequency of future meetings are briefly discussed. Andrea proposes to have fortnightly meetings on Wednesdays at 17:00 CERN time; no objection is made, but the proposal will be repeated by email to make sure that people who were absent can express their opinion.